Download Database Notes (full version) - The ELCHK Yuen Long Lutheran

AS LEVEL
Computer Application
Databases
YLLSS
In the syllabus, we have
Applications of databases in society
Students should be aware of the uses and applications of databases in
everyday life (e.g. the library system, inventory system in a supermarket,
credit card system, etc.).

Students should be given opportunities to discuss the importance of
databases in business environments and how they are related to the success
of a business.

Concepts and terminology
Students should understand the following terminology and concepts:
• data and information
• data, fields, records, tables, files and databases
• common data types such as integer, real, character, string, boolean,
date, etc.
• indexes and keys
• database management systems (e.g. data definition language, data
manipulation language, data dictionary, transaction processing
and access control, etc.)
• program-data independence
• data redundancy and data integrity

Basic concepts of a relational database
Students should know the basic concepts underpinning relational databases
such as entity, relation, attribute, domain, primary key, foreign key,
candidate key, entity integrity, referential integrity, domain integrity, etc.
Students should be able to identify these basic elements in examples taken
from everyday applications.

Students should know how to organise data differently but sensibly in a
relational database and be able to establish the required relationships to link
up the tables.

Creating a relational database
Students should be able to create a simple relational database based on
specified requirements using SQL.

Database maintenance and manipulation
Students should be able to use SQL to maintain a simple relational database,
manipulate its data or retrieve the required information. They should be able
to:
• modify the structure of the tables
• add, delete and modify the data in the tables
• view, sort and select the contents by filtering
• use appropriate operators and expressions such as the in, between
and like operators, arithmetic operators and expressions, comparison
operators and logical operators, etc. to perform specific operations
• use simple built-in functions such as aggregate and string functions,
etc.
• perform multiple field indexing and multi-level ordering
• perform queries on multiple tables including the use of equi-join,
natural join and outer join
• perform sub-queries (for 1 sub-level only)
• export query results to, for example, text, html or spreadsheet
format, etc.

The conceptual data model
Students should understand the importance of good database design in
effective database management. They should be aware of the three levels of
data abstraction; namely conceptual level, physical level and view level.

Entity-Relationship modeling
Students should be aware of the three types of relationship (one-to-one,
one-to-many, many-to-many) among entities in a relational database.

Students should be able to create simple entity-relationship (ER) diagrams
involving binary relationships only in designing databases for simple
business scenarios. This includes the resolution of many-to-many
relationships into multiple one-to-many relationships in order to implement the
database.

Students should be able to transform the ER diagrams to tables in relational
databases and be able to create a database schema for a given set of data
to describe the characteristics of the database.

Introduction to Normalisation
Students should be able to briefly explain the meaning and purpose of
normalisation. They should be aware of the methods or measures used to
reduce data redundancy.

Introduction to Databases
Data vs. Information
Numbers, text, images or any recording in a form that is accessible to human beings are classified
as data. Data by themselves have no meaning. Only when data are interpreted does their
content become meaningful. Interpreted data are referred to as information. For example, the
number 33.5 tells us almost nothing. However, when readers are told that the number stands for
a temperature in centigrade, the number makes sense. In this example, 33.5 is a piece of
data whereas 33.5 as a temperature in centigrade is a piece of information. Information is stored in
computers such that both its data value and its interpretation are recorded. In most cases, the
interpretation of computer data is given by the corresponding data name.
In the context of databases (which will be elaborated in the next section) as well as in daily use, the
terms “information” and “data” are often used interchangeably, although this confusion is
not desirable. In most cases, the interpretation of the term “data” should be clear from the context of
discussion. In the context of databases, “data” usually means “information”.
The Data Hierarchy
Each information system has a hierarchy of data organization, and each succeeding level in the
hierarchy is the result of combining the elements of the preceding level. The six levels are bits,
characters (bytes), fields (data elements), records, files, and data base (see Figure 1). A bit is a
binary digit which has a value of either 0 or 1. A byte is composed of 8 ordered bits.
Figure 1. Hierarchy of data organization.
Question to ponder
• Are byte and character types the same? (Answer: Not necessarily. This depends on the
underlying encoding scheme being adopted. Even an ASCII character may need more than
one byte of storage in certain Unicode encodings.)
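The answer above can be checked with a short Python sketch. The sample characters and the choice of encodings are only illustrative assumptions; the point is that the number of bytes depends on the encoding scheme, not on the character count.

```python
# Compare the number of characters with the number of bytes under several encodings.
text = "A"  # a single ASCII character

byte_counts = {}
for encoding in ("ascii", "utf-8", "utf-16-le", "utf-32-le"):
    encoded = text.encode(encoding)
    byte_counts[encoding] = len(encoded)
    print(f"{encoding:10} characters={len(text)} bytes={len(encoded)}")

# A non-ASCII character needs more than one byte even in UTF-8.
chinese = "中"
utf8_length = len(chinese.encode("utf-8"))
print(f"utf-8      characters={len(chinese)} bytes={utf8_length}")
```

The same one-character string occupies one byte in ASCII and UTF-8 but two bytes in UTF-16 and four in UTF-32, which is why field length is better counted in characters than in bytes.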
Data Field/Element
A (data) field or data element is the lowest level “logical unit” in the data hierarchy that can be
interpreted in a meaningful way, e.g., “David” for a name, “23469345” for a phone number. The
maximum number of characters (not bytes) that a field can have is called the field length. A field
may consist of a single character only, e.g., M(ale) and F(emale) for representing sex. The
granularity of a field is the user’s decision, e.g., we can treat an address as a single field or as an
aggregate of several fields such as flat-and-floor-number, street-number-and-street-name,
district, city and country, etc. The key concern is the needs of the application. If certain processing
is required to handle an address at city level, we of course need to divide the address field into its
components.
Record
A record is a logical group of related data fields describing an event or an item, e.g., a student
enrollment record consists of fields such as student-ID, student-name, programme-code,
module-code, date-of-enrollment, etc. A record is the lowest level logical unit that can be
accessed from a file. In other words, if one would like to access a data field within a record, the
whole record has to be retrieved first before the required data field is identified.
File
A logical file is composed of occurrences of records. A physical file refers to a named
area on a secondary storage device that contains a program, textual material, or even an image.
One logical file is not necessarily mapped to one physical file and vice versa. For example, a
logical file may consist of an index area and a record area such that each of the areas is associated
with a separate physical file. End-users are usually concerned with logical files instead of physical
files.
Questions to ponder
• Give an example in which a physical file contains more than one logical file.
• Give an example in which a logical file is stored in multiple physical files.
Data Base
A data base is a collection of files that are logically related and integrated with one another so that
data redundancy is minimized or reduced. Data redundancy exists when a data field is stored in
more than one logical file. Data redundancy often cannot be eliminated entirely for various
reasons but it should be kept under control. A database management system is devised to control
the data redundancy problem, ideally by storing every data item once and/or by propagating data
changes to all related record occurrences, possibly across a number of files, so that data integrity
(which concerns the validity, accuracy and correctness of data) can be maintained. A database
management system is often referred to as a DBMS, database or database system.
Teaching remark
• In many books or online learning resources, the term “data base” is often incorrectly written
as “database”. It has reached the stage where people use the two terms interchangeably. In
fact, the ASCA and ALCS Curriculum and Assessment Guides also use the terms
interchangeably.
Need for Storing Persistent Data
Almost all computer applications require some data to be kept for describing some inherently stable
properties or the up-to-date status of certain items or events. Let us think about the information kept
by a bank for its saving account holders. For each saving account, the bank must store its unique
account number, name(s) of account holder(s), contact address(es) of account holder(s), account
balance, etc., to name a few. Those data are considered to be persistent data as they are not
changed frequently. However, some data are more persistent than others. For example, an
account number should never be changed whereas there is a slim chance that changes would be
required for the name(s) of the account holder(s). Account balance is the most susceptible to change
among the listed pieces of data as transactions like money deposit or withdrawal will affect its value.
Obviously the correctness of all recorded persistent data is important to the functioning of the
associated computer applications.
Whether or not a piece of data is persistent varies from application to application. Age may not be
considered as a piece of persistent data as it changes every year for most people. However, the age
field is definitely persistent if it appears on a death certificate.
Problems of File Systems
Persistent data can be stored in file(s). However, there are potential problems with that.
1. Since files are designed to fit individual application needs, a data element may appear in several
files if that piece of data is needed in several applications. For example, a bank customer may
open a saving account and a stock account at the same time. For the stock account, the account
balance is composed of the quantity of each stock purchased. Obviously at least two different
files are needed to keep data for the two types of accounts but data elements such as name(s) of
account holder(s) and contact address(es) of account holder(s) are common. When the customer
moves to a new address, both files must be updated. This is the data
redundancy problem. Data redundancy can cause a number of problems during data
modifications; those problems are referred to as data anomalies (which will be detailed later).
2. A consequence of the data redundancy problem is the integrity problem or data consistency
problem. Data become inconsistent if copies of data are not updated simultaneously.
3. Traditional file systems suffer from the sharing problem and the security problem. If a new report
which needs to use some but not all data from two files is required, should one be allowed to
access both files? As access control on a file system can only be made at the file level, allowing
someone to read both files implies unnecessary exposure of data. If a new file is created to
store all data needed to produce the new report, the data redundancy problem emerges.
4. Structural dependence (also known as program-data dependence) is exhibited by file systems.
In order to use a file, a program needs to know the file structure, i.e., details of all data stored in
the file. A change in any file’s structure requires the modification of all programs using that
file.
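Problem 4 can be illustrated with a minimal Python sketch. The fixed-width record layout, the field widths and the function name read_balance_v1 are all hypothetical; they only serve to show how a program that hard-codes the file structure stops working when that structure changes.

```python
# Hypothetical fixed-width layout: name in columns 0-9, balance in columns 10-17.
def read_balance_v1(record: str) -> int:
    """Program written against the ORIGINAL file structure."""
    return int(record[10:18])

# A record in the original layout is read correctly.
old_record = "David".ljust(10) + "1500".rjust(8)
old_balance = read_balance_v1(old_record)
print(old_balance)

# The file structure changes: the name field is widened from 10 to 20 characters.
new_record = "David".ljust(20) + "1500".rjust(8)

# The unchanged program now reads padding instead of the balance and fails.
try:
    read_balance_v1(new_record)
    still_works = True
except ValueError:
    still_works = False
print("old program still works:", still_works)
```

Every program that slices records in this way would have to be modified after the change, which is exactly the structural dependence described above.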
Aims of Database Systems
The aims of database systems are as follows:
• Reduce data redundancy and inconsistency
• Separate user data view from physical file structure (see next section for details)
• Impose data integrity constraints, e.g., for data validation
• Tackle the atomicity problem, i.e., all activities in a transaction are either completely performed
or completely undone. For example, if money is transferred from a saving account to a stock
account, the saving account will be debited whereas the stock account will be credited with the
same amount. The corresponding transaction has to ensure that both data changes are done as
one single unit.
• Allow concurrent data access
• Offer secured data access
• Help make data management more efficient and effective
• Allow quick answers to ad hoc queries using some query language
• Provide end users better access to more and better-managed data
Some databases may not be able to achieve all the above aims. Early databases may not support
transaction processing or offer secured data access for concurrent users.
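The atomicity aim can be sketched with Python's built-in sqlite3 module. The table name account, the starting balances and the rule that a balance may not go negative are assumptions made for illustration only; they are not part of the bank example in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # assumed business rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("saving", 1000), ("stock", 0)])
conn.commit()

def transfer(conn, source, target, amount):
    """Credit target and debit source as one unit; undo both on any failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, target)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a dictionary keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account"))

ok = transfer(conn, "saving", "stock", 300)       # both updates succeed
failed = transfer(conn, "saving", "stock", 5000)  # the debit violates the CHECK rule
print(ok, failed, balances(conn))
```

In the second transfer the credit to the stock account has already been executed when the debit fails, yet the rollback removes it as well, so the two accounts never show a half-completed transfer.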
Separating User Data View from Physical File Structure
A key advantage of a database is that end-users and application programmers do not have to know
how data files are organized and stored in the database. This is referred to as structural
independence (or program-data independence). Thus changing the structure of a file does not
necessarily require that the computer programs accessing the file be modified. Databases achieve
structural independence by organizing data through advanced data structures in which the data
fields and records are related to each other. Computer programs do not access files directly for data.
Instead, computer programs that need to access data direct their requests to the DBMS,
which in turn processes the requests against the data base; in other words, all operations on the data
base are coordinated by the DBMS. Figure 2 describes the interactions between different parties
in a database environment.
Figure 2. Interactions between various parties in a database environment.
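As a rough illustration of structural independence, the following Python sketch uses SQLite as the DBMS. The table library_user, its columns and the function find_name are hypothetical names chosen for this example. The application code asks for data by name through the DBMS and keeps working after the stored structure is changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE library_user (user_id TEXT PRIMARY KEY, user_name TEXT)")
conn.execute("INSERT INTO library_user VALUES ('U001', 'David')")

def find_name(conn, user_id):
    """Application code: requests data from the DBMS by column name only."""
    row = conn.execute(
        "SELECT user_name FROM library_user WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None

before = find_name(conn, "U001")

# The stored structure changes: a new column and an index are added.
conn.execute("ALTER TABLE library_user ADD COLUMN contact_address TEXT")
conn.execute("CREATE INDEX idx_user_name ON library_user (user_name)")

# The application code is unchanged and still returns the same answer.
after = find_name(conn, "U001")
print(before, after)
```

Not every structural change is harmless (dropping or renaming user_name would still break the query), which matches the wording above that a change does not necessarily require programs to be modified.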
Applications of Databases in Society
Almost all computer applications need to use a database to store persistent data.
In a library system, at least the following data need to be kept.
• Library user ID number
• Library user name
• Library user contact address
• Maximum number of books that a library user can borrow
• Library user ID number, book’s call number and due date of the loan period for each book
which is on loan
• Author name(s), publisher, year of publication, and status (e.g., on-loan, on-hold, on-request, and
missing, etc.) of each book.
The above information must be kept in order to support basic library operations like book search,
borrowing and return, etc.
In a supermarket, inventory information needs to be stored so as to facilitate the inventory,
purchasing, marketing and other business functions of the company. Some of the information to
be kept is given below.
• Item ID number
• Item name (e.g., ABC dental cream)
• Item category (e.g., oral hygiene)
• Unit price
• Stock level
• Reorder level (below which an order needs to be placed for replenishment)
• Reorder amount (i.e., the number of items to be ordered)
In a credit card system, the following information should be recorded.
• Card number
• Card owner’s name, contact address and phone number
• Credit limit
• Credit amount used
• Card’s expiry date
• Card’s date of issue
• First issued date
• Number of times that the card was reported missing
• Number of times of late payment
Databases do not only support the day-to-day operations of organizations. Applications can be built
to analyze historical data in databases for planning purposes. Banks use various types of customer
information such as account balances, salary information, saving patterns, credit card repayment
patterns and mortgage repayment patterns to create their customers’ profiles. Customer details like
occupation, age and marital status are recorded too. Such information is stored in databases and
is analyzed so as to enable the banks to identify potential customers for specific products,
e.g., fund investment and insurance. This kind of database application is known as data
mining, which analyzes data in databases to look for data trends or anomalies without prior
knowledge of the meaning of the data.
The amount of operational data would be too much for management staff to digest. Besides, the
data would be too raw for them to make management decisions. In practice, operational data are
typically summarized (and sometimes stored in a data warehouse) before they are presented to the
management. All the data mentioned, whether in raw or digested form, are stored in some form of
database.
Types of Databases
There are many ways to classify databases and two of them are listed below.
Number of Users
Many databases designed to run on personal computers are expected to be used by one user at a
time. We usually refer to them as single-user databases. Earlier versions of Microsoft FoxPro
and Access belong to this type.
More sophisticated databases like MySQL, Microsoft SQL Server, IBM DB2 and Oracle are called
multi-user databases as they have built-in facilities for secured and concurrent data access.
Location
A database may be either centralized or distributed. In a centralized database, all database
functions run entirely on a single computer. A distributed database is composed of a set of
partially independent databases running on a group of networked computers that share a common
schema (i.e., an overall design of the data base), and coordinate processing of transactions that access
non-local data (Silberschatz et al., 1997).
Reference
Silberschatz, A., Korth, H.F., & Sudarshan, S. (1997). Database System Concepts (3rd ed.). McGraw Hill.
Another form of distributed implementation of databases, more commonly known as client-server
databases, focuses on the distribution of various database functions over multiple computers. In
particular, the database front-end functionality such as input validation is typically done by the
client machines (which are usually personal computers) whereas the back-end functionality like
transaction handling and data base update is provided by server systems, which are typically either
data servers or transaction servers.
Data Models
A data model is a collection of logical constructs used to represent the data structure, data semantics
and data relationships found within the database. Data models can be conceptual or
implementation oriented. Conceptual data models are used to describe data at the logical and
(user) view levels. They offer no description of implementation issues. Conceptual models
are often used as a communication tool between database designers and end-users so as to help the
designers understand the data requirements of the end-users correctly. The entity-relationship
model is an example of a conceptual data model. Another type of data model provides a high-level
description of the implementation. Three popular implementation models are the hierarchical,
network and relational models. Note that the problem of structural dependence in both the
hierarchical and network models is resolved in the relational model.
The key advantages of the relational model are as follows:
• Structural independence
• Improved conceptual simplicity as data are structured in simple-to-understand tables
• Easier database design, implementation, management, and use
• Ad hoc query capability with the use of the structured query language
• Powerful database management system can be built with the system’s complexity being hidden
from the user view
Relational Database Concepts
Introduction
In this section, basic relational database terminology and concepts will be introduced. The
definitions and characteristics of entity, relation, attribute, domain and key, etc., are detailed. In
particular, the difference between keys and indexes, and three concepts about data integrity, namely
entity integrity, referential integrity and domain integrity, are explained. In order to help explain
the above terminology and concepts, a problem scenario about a school library is introduced
below:
The library of XYZ School has decided to computerize its services so as to make them
more efficient and effective. Since computerization is relatively new to the school, the
library aims to provide only basic library functions to the users initially through the
implementation of a simple computerized library system. The system is expected to
offer a computerized catalogue of all library items, e.g., books and past examination
papers, and basic circulation functions such as item borrowing, returning and reserving.
Obviously the system needs to keep library user information such as the number of
library items that s/he is allowed to borrow, dates and call numbers of those library
items that s/he has borrowed, or requested, etc. Library item details such as its call
number, author(s), ISBN, year of publication and status (e.g., available, on loan,
requested and damaged), etc., are also kept.
As a teacher librarian of the school, you are asked to design a suitable database
schema to support the mentioned library operations.
Whenever applicable, examples will be provided in relation to the above problem scenario so as to
provide a clear context for illustrating the database terminology and concepts.
Entity and Entity Set/Type
An entity is a distinguishable object to be described. It can be any object such as a person, a place,
an event or a thing, etc. Entities that share the same properties or attributes are collectively
referred to as an entity set (or entity type). Example entity sets that can be found in a school
environment are students (person), classrooms (place), examinations (event), and subjects (thing),
etc.
Entity sets in the XYZ School library example:
o Suppose Linus and Jeff are students. They are entities (library users) because they share
the properties of a student and are distinguishable objects in a school library system.
o Entity sets include library users, who may be teachers or students (person), library items
(things), circulation transactions such as a book request (event), and user privileges (things), etc.
Teaching remark (out of syllabus)
• An entity set (type) may be further divided into a supertype and subtypes if required. In the
XYZ School library example, both teachers and students are classified as library users.
However, it is possible that we need to further divide teachers and students into separate
subtypes to meet certain application needs. For example, a student library user needs
to be associated with his/her class if the school would like to research into the number of books
borrowed per student from each class. Such an association also helps the teacher librarian to
learn more about the reading habits of various classes of students. Obviously this kind of class
association does not exist for teachers. Conversely, the library may want to know the number
of times that a teacher does not return borrowed items to the library on time and the cumulative
number of days overdue (as students are fined for late return of library items but it is not always
easy to implement a similar system for teachers). Such a function can help the teacher librarian
to identify those colleagues who do not fully respect the library regulations. Storing those
pieces of information is obviously not necessary for student library users. The similarities and
differences in the application needs of various library users imply a need for a finer
classification among them. For example, common attributes of library users such as library
user ID, name, address, etc., are kept in the supertype whereas non-common attributes of
teacher and student are kept in the corresponding refined entities. The supertype and
corresponding subtypes are structured to form a generalization hierarchy.
In a relational database, an entity set is typically represented in terms of one or more relations (a
mathematical term for tables), each of which is composed of rows and columns. Each
tuple (a mathematical term for a row in a table) in a relation represents an entity of the associated
entity set. Each column, which is uniquely named within the table that it is associated with,
represents a category of information that corresponds to an attribute. A relational database is
typically composed of a number of related tables. Note that the order of the rows and columns
within a table is immaterial to the database.
As shown in the table below, the “user privilege” entity set of XYZ School library example is
composed of 6 rows with each row defining the privilege of a user type for a given material type.
Table 1. The “user privilege” table of XYZ School library (annotated to show that a column is an
attribute and a row is a tuple).
Attributes
Attribute and Domain
Each entity has certain descriptive properties known as attributes (or fields). Some potential
attributes for the student entity are student-name, student-number, and sex, etc.
Attributes in the XYZ School library example:
o student ID, class name (in the “library user” entity set)
o call number, material type (e.g., CD-ROM, book), item name (e.g., book title)
The set of all possible values for an attribute is called its domain. For the student entity set, the
domain of the attribute sex should be {female, male} whereas the domain of the attribute age
should be any positive integer (although it may make more sense to set an upper bound for the
domain).
Attribute domains in XYZ School library example:
o Domain of “class name”: all valid class names found in XYZ school.
o Domain of “maximum number of library items that a user can borrow”: any
positive integer not greater than 10.
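A domain can be enforced by the DBMS itself. The sketch below uses SQLite through Python and assumes that the intended domain of the maximum number of items is the whole numbers 1 to 10; the table and column names (user_privilege, user_type, max_items) and the helper try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_privilege (
        user_type TEXT NOT NULL,
        -- assumed domain: whole numbers from 1 to 10
        max_items INTEGER NOT NULL CHECK (max_items BETWEEN 1 AND 10)
    )
""")

def try_insert(user_type, max_items):
    """Return True if the DBMS accepts the row, False if the domain check rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO user_privilege VALUES (?, ?)", (user_type, max_items))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("teacher", 10)        # inside the domain
rejected_high = try_insert("student", 11)   # above the upper bound
rejected_zero = try_insert("student", 0)    # zero is outside the domain
print(accepted, rejected_high, rejected_zero)
```

Declaring the domain in the table definition means that every program using the table gets the same validation without repeating it in application code.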
Relational database theory does not restrict the data type that an attribute can be associated with.
However, some commonly supported data types in relational databases are:
• Number (integer or real number)
• Text (fixed length or variable length)
• Boolean type
• Date and time
Simple vs. Composite Attributes
Attributes that cannot be divided into subparts are known as simple attributes (e.g., age);
otherwise they are composite attributes (e.g., address). Whether there is a need to re-structure
an attribute into finer attributes depends on the application needs. In the XYZ School library
example, the library user name is kept as a single composite attribute; it is not further divided
into simpler attributes such as first-name and surname. Such a representation does not cause any
problem as the library does not have any need to process library information in accordance with
its users’ first-names or surnames. To facilitate detailed queries (for the future), many database
designers prefer to change a composite attribute into a series of simple attributes.
Null Attributes
It is possible to use a null as the value of an attribute of an entity. For example, the value of the
ISBN field will be set to null for past examination papers but a valid ISBN is needed for most
books.
Derived Attributes
On some occasions, the value of an attribute can be derived from other related attributes or entities.
Such an attribute is referred to as a derived attribute. Suppose a database keeps an
employee table to store employee information like employee-number, employee-name and
number-of-dependents, and a dependent table to record information of each employee’s
dependent in a separate row. In this case, the number-of-dependents attribute in the employee
table is a derived attribute as its value is equal to the number of associated rows in the dependent
table.
In a good database design, integrity constraints (which will be detailed later) should be defined
between derived attributes and their base attributes in order to ensure that an update of the value of
any base attribute will trigger a corresponding update of any associated derived attributes.
Otherwise, data inconsistency will occur.
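One possible way to keep a derived attribute consistent is a pair of database triggers. The following Python sketch uses SQLite and follows the employee and dependent example above; the column names, trigger names and sample rows are assumptions for illustration, and triggers are only one of several mechanisms a DBMS may offer for this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_number INTEGER PRIMARY KEY,
        employee_name TEXT NOT NULL,
        number_of_dependents INTEGER NOT NULL DEFAULT 0  -- derived attribute
    );
    CREATE TABLE dependent (
        dependent_id INTEGER PRIMARY KEY,
        employee_number INTEGER NOT NULL REFERENCES employee (employee_number),
        dependent_name TEXT NOT NULL
    );
    -- Keep the derived count in step with its base rows.
    CREATE TRIGGER dependent_added AFTER INSERT ON dependent
    BEGIN
        UPDATE employee SET number_of_dependents = number_of_dependents + 1
        WHERE employee_number = NEW.employee_number;
    END;
    CREATE TRIGGER dependent_removed AFTER DELETE ON dependent
    BEGIN
        UPDATE employee SET number_of_dependents = number_of_dependents - 1
        WHERE employee_number = OLD.employee_number;
    END;
""")

conn.execute("INSERT INTO employee (employee_number, employee_name) VALUES (1, 'Chan')")
conn.execute("INSERT INTO dependent (employee_number, dependent_name) VALUES (1, 'Amy')")
conn.execute("INSERT INTO dependent (employee_number, dependent_name) VALUES (1, 'Ben')")
conn.execute("DELETE FROM dependent WHERE dependent_name = 'Amy'")

# The stored (derived) value should equal the value computed from the base rows.
stored = conn.execute(
    "SELECT number_of_dependents FROM employee WHERE employee_number = 1"
).fetchone()[0]
computed = conn.execute(
    "SELECT COUNT(*) FROM dependent WHERE employee_number = 1"
).fetchone()[0]
print(stored, computed)
```

This sketch does not handle a dependent being reassigned to another employee; a complete design would need an update trigger as well.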
Intuitively, we should eliminate all derived attributes of a database because their values, if required,
can be computed in real-time. However, the use of derived attributes can improve the efficiency of
a database. In the XYZ School library example, it is better to have (derived) attributes to record
the number of times that a teacher does not return borrowed items to the library on time and the
cumulative number of days overdue although those pieces of information can be derived from the
teacher’s circulation records history. The use of derived attributes in this example can greatly
enhance the database efficiency when compared to rescanning all past circulation records of a
teacher to compute the required information. In this example, the computational effort for
maintaining the integrity of the values of the derived attributes and their base attribute values is
small.
Keys

• A key is a value of one or more selected attributes used to identify an entity in an entity set.
The concerned attribute(s) is/are known as the key field(s). A potential key field of the
“library user” entity set of the XYZ School library example is the “library user ID” which is
unique for each library user.
• A superkey is a set of one or more attributes that, taken collectively, uniquely identify an entity
in an entity set. However, a superkey may contain extraneous attributes. In the “user
privilege” table of the XYZ School library example, all of the following combinations of
attributes are superkeys:
o “User type” and “Type of material”
o “User type”, “Type of material” and “Loan period”
o “User type”, “Type of material”, “Loan period”, and “Total number of items that can
be borrowed”
o “Description” and “Type of material”
o “Description”, “Type of material” and “Loan period”
Once the values of any of the above attribute combinations are given, we can always uniquely
identify an entity (row) in an entity set (table).
The following attribute combinations are NOT superkeys:
o “User type” and “Description”
o “Type of material” and “Loan period”
because, given the values of any of the above attribute combinations, more than one entity (row)
may be identified.
Teaching remarks
• The identification of superkeys for a table must be based on the semantics of the attributes
of the table instead of the table content. In the “user privilege” table of the XYZ School
library example (see Table 2), it appears that, given the values of the “Loan Period” and
“Total number of items that can be borrowed”, a unique entity (row) can be identified and
thus the two attributes, when combined, can be taken as a superkey. However, this is
misleading. Suppose school alumni are allowed to use the library and they are allowed to
borrow up to 3 books for a maximum of 14 days. This obviously makes the “Loan Period”
and “Total number of items that can be borrowed” no longer a superkey as a junior student
is also allowed to borrow the same number of books for the same loan period.
• In reality, teachers as well as textbooks often use table contents to explain the concept of
key (and normalization, which will be covered later). Teachers must indicate to students
their assumption that the table contents give an exhaustive illustration of the table semantics.
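The pitfall described in the remarks above can be reproduced with a small Python sketch. The rows below are hypothetical: only the “3 books for 14 days” privilege of junior students and alumni comes from the text, and the teacher values and the user type codes are invented for illustration. The function is_unique checks table content only, so it can refute a superkey but can never prove one.

```python
def is_unique(rows, attributes):
    """Return True if no two rows share the same values for the given attributes."""
    seen = set()
    for row in rows:
        key = tuple(row[name] for name in attributes)
        if key in seen:
            return False
        seen.add(key)
    return True

# Hypothetical sample content of the "user privilege" table.
rows = [
    {"User type": "J", "Description": "Junior student", "Type of material": "Book",
     "Loan period": 14, "Total number of items that can be borrowed": 3},
    {"User type": "T", "Description": "Teacher", "Type of material": "Book",
     "Loan period": 30, "Total number of items that can be borrowed": 10},
]
pair = ["Loan period", "Total number of items that can be borrowed"]

# With the current content the pair happens to look unique ...
looks_like_superkey = is_unique(rows, pair)

# ... but adding alumni with the same privilege as junior students refutes it.
rows.append(
    {"User type": "A", "Description": "Alumni", "Type of material": "Book",
     "Loan period": 14, "Total number of items that can be borrowed": 3}
)
still_unique = is_unique(rows, pair)

# The semantically chosen superkey remains unique after the change.
semantic_key_unique = is_unique(rows, ["User type", "Type of material"])
print(looks_like_superkey, still_unique, semantic_key_unique)
```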

Minimal superkeys are called candidate keys. Removal of any attribute in a candidate key
will render the remaining attribute(s) no longer a key. In the “user privilege” table of the XYZ
School library example, all of the following combinations of attributes are candidate keys
o “User type” and “Type of material”
o “Description” and “Type of material”
The above example shows that a table may have more than one candidate
key. However, multiple candidate keys in a table might imply the existence of a transitive
dependency in the table. Transitive dependency is an indicator of poor database design and
should be avoided. The notion of transitive dependency will be introduced together with
the notion of database normalization.
Teaching remark
 Like superkeys, the identification of candidate keys for a table must NOT be based on the table
content, but on the semantics of the attributes of the table.
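The relation between superkeys and candidate keys can be sketched in a few lines of Python. The function name is hypothetical, and the list of superkeys is assumed to have been obtained from the semantics of the attributes (not from the table content) and to be complete. A candidate key is then any superkey that has no proper subset which is also a superkey.

```python
def candidate_keys(superkeys):
    """Keep only the superkeys with no proper subset that is also a superkey."""
    sets = [frozenset(k) for k in superkeys]
    return [k for k in sets if not any(other < k for other in sets)]


# Superkeys of the "user privilege" table as listed in the text
# (assumption: the list is complete for the purpose of this sketch).
superkeys = [
    {"User type", "Type of material"},
    {"User type", "Type of material", "Loan period"},
    {"User type", "Type of material", "Loan period",
     "Total number of items that can be borrowed"},
    {"Description", "Type of material"},
    {"Description", "Type of material", "Loan period"},
]

keys = candidate_keys(superkeys)
```

With this input, `keys` contains exactly the two candidate keys named above.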

A primary key is a candidate key chosen by the database designer as the major means of
identifying an entity (row) within an entity set (table). No part of a primary key can be null.
Unlike candidate keys, of which a table may have several, a table can have only one primary key.
Teaching remark
 Some textbooks in the market may have given an imprecise definition of candidate key and
primary key. In one textbook, a primary key is defined as a field or combination of fields
that uniquely and minimally identify a particular record in a table. According to this
definition, it is possible that a table would have more than one primary key but this is
obviously incorrect. The definition given in the book in fact describes a candidate key
rather than a primary key.

Any attribute which is not a part of any candidate key is known as a non-key attribute. In the
XYZ School library example, the loan-period is a non-key attribute.

A foreign key is a set of attributes that is not a superkey in its own table but is a candidate key in another
table; its value may also be null. Suppose we have two tables, namely student-subject and subject, which store the
subjects in which a student has enrolled and the subject descriptions respectively. The
student-subject table records student-ID (a part of the primary key) and subject-ID (another
part of the primary key) whereas the subject table stores subject-ID (primary key) and
subject-descriptor. The subject-ID in the student-subject table is a foreign key to the
subject table.
student-subject                      subject
student-ID    subject-ID             subject-ID    subject-descriptor
200425642     CS1132                 CS1132        Databases
200425654     CS1132                 CS1145        Programming
200425854     CS1145
(subject-ID in the student-subject table is a foreign key to the subject table)
Teaching remarks

It is wrong to say merely that the subject-ID in the student-subject table is a foreign key. The notion
of foreign key is defined on two tables, so one should say that it is a foreign key to the subject table.

Many textbooks do not explicitly state that the value of a foreign key can be null.
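The student-subject example above can be tried out with Python's built-in sqlite3 module. This is only a sketch under stated assumptions: the underscore table and column names and the helper try_enrol are ours, and SQLite is used purely for convenience (it is not prescribed by the curriculum).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute("""
    CREATE TABLE subject (
        subject_id TEXT NOT NULL PRIMARY KEY,
        subject_descriptor TEXT
    )""")
conn.execute("""
    CREATE TABLE student_subject (
        student_id TEXT NOT NULL,
        subject_id TEXT NOT NULL REFERENCES subject(subject_id),
        PRIMARY KEY (student_id, subject_id)
    )""")

conn.executemany("INSERT INTO subject VALUES (?, ?)",
                 [("CS1132", "Databases"), ("CS1145", "Programming")])
conn.executemany("INSERT INTO student_subject VALUES (?, ?)",
                 [("200425642", "CS1132"), ("200425654", "CS1132"),
                  ("200425854", "CS1145")])


def try_enrol(student_id, subject_id):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO student_subject VALUES (?, ?)",
                     (student_id, subject_id))
        return True
    except sqlite3.IntegrityError:
        return False


known_subject = try_enrol("200425642", "CS1145")    # subject exists in the subject table
unknown_subject = try_enrol("200425642", "CS9999")  # no such subject
```

The first enrolment is accepted, whereas the second is rejected because CS9999 does not appear as a subject-ID in the subject table.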
Indexes

One or more indexes can be defined for a table for efficient data retrieval. Unlike a primary key,
an index does not have to be unique. Whether or not an index is required for a table depends
on the application needs. Inclusion or omission of an index in a table definition may affect the
efficiency, but not the functionality, of any data retrieval.

An index is an implementation structure such that given one or more attribute values, relevant
rows can be efficiently retrieved. It is typically implemented through the use of sophisticated
data structures like ISAM and B+ trees.
Common mistakes

Some people may use the terms “index” and “secondary key” interchangeably but this should be
avoided. Keys are logical concepts whereas indexes are implementation concepts. In fact,
there is no notion of “secondary key” or “index” in relational database theory.
Teaching remarks

Most relational databases create an index for the primary key of each table for efficient data
retrieval.

Although indexing can facilitate efficient data retrieval, it should not be overused. Index
creation and maintenance may involve a lot of computation that takes time to finish.
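The claim that an index affects the efficiency but not the functionality of data retrieval can be checked with a short sqlite3 sketch. The table, the rows and the index name are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (call_number TEXT PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
    ("QA76.1", "Databases", "Chan"),
    ("QA76.2", "Programming", "Lee"),
    ("QA76.3", "Networks", "Chan"),
])

query = "SELECT call_number FROM book WHERE author = ? ORDER BY call_number"
before = conn.execute(query, ("Chan",)).fetchall()

# A non-unique index: two books share the author "Chan".
conn.execute("CREATE INDEX idx_book_author ON book (author)")
after = conn.execute(query, ("Chan",)).fetchall()

# The access plan chosen by SQLite can be inspected, but the query result
# must be the same with or without the index.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("Chan",)).fetchall()
```

The lists `before` and `after` are identical; only the access path recorded in `plan` may differ once the index exists.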
Data Integrity
As mentioned before, data integrity is concerned with the validity, accuracy and correctness of data.
In relational databases, three types of data integrity are of particular concern. They are entity
integrity, domain integrity and referential integrity.
Entity Integrity
Entity integrity is a property that ensures that
1. no rows are duplicated, and
2. no attributes that make up the primary key have a null value.
Note that condition 1 must be enforced or a primary key will not be able to uniquely identify an
entity (a row) in an entity set (a table). As an example, the “user privilege” table does meet the
criteria of entity integrity.
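A minimal sqlite3 sketch of the two conditions follows. The column names and sample values are assumptions made for illustration. NOT NULL is declared explicitly because SQLite, unlike the SQL standard, accepts null values in most primary key columns unless told otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_privilege (
        user_type TEXT NOT NULL,
        material_type TEXT NOT NULL,
        loan_period INTEGER,
        PRIMARY KEY (user_type, material_type)
    )""")


def accepted(row):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO user_privilege VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    accepted(("junior", "book", 14)),  # first row with this primary key
    accepted(("junior", "book", 7)),   # duplicate primary key value
    accepted((None, "book", 14)),      # null in a primary key attribute
]
```

Only the first row is accepted; the other two violate entity integrity and are rejected.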
Domain Integrity
Domain integrity is a property that ensures that whenever a new data item is entered into the
database, it must be within the domain of the corresponding attribute. For instance, the
enforcement of domain constraint can stop one from entering a value other than “female” or
“male” to the sex attribute.
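The sex attribute example can be sketched with a CHECK constraint in sqlite3. The table and column names are assumptions, and other database management systems may offer different means of restricting a domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE library_user (
        user_id TEXT NOT NULL PRIMARY KEY,
        sex TEXT NOT NULL CHECK (sex IN ('female', 'male'))
    )""")


def accepted(user_id, sex):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO library_user VALUES (?, ?)", (user_id, sex))
        return True
    except sqlite3.IntegrityError:
        return False


inside_domain = accepted("U001", "female")
outside_domain = accepted("U002", "unknown")  # not in the domain of the sex attribute
```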
Referential Integrity
Referential integrity is concerned with the data consistency between coupled tables. In particular,
we may want to ensure that an attribute value that appears in one table also appears for a certain set
of attributes in another table. For example, the XYZ School library database may keep one table
to store library user personal information like user-ID, user-name, and contact-address, etc. and
another table to keep information about loaned books like user-ID, book-call-number, and due
date, etc. The user-ID is the primary key of the library-user-details table whereas the
concatenation of user-ID and book-call-number forms the primary key of the loaned-book table.
The user-ID attribute of the loaned-book table is a foreign key to the library-user-details table (as
user-ID is not a superkey in the loaned-book table but a candidate key in the library-user-details
table). Obviously, it is important to ensure that any value that appears in the user-ID attribute of the
loaned-book table also appears in the user-ID attribute of the library-user-details table.
In relational databases, referential integrity is typically enforced by defining a referential
constraint between a primary key and a foreign key. For referential integrity to hold, any attribute(s)
in a table that is declared a foreign key can contain only values from the primary key attribute(s) of
the table that the foreign key relationship is referred to. Thus, deleting a row that contains a value
referred to by a foreign key in another table would break referential integrity. In the XYZ School
library example, this is equivalent to removing a library user from the library-user-details table
without requiring the user to return all books that s/he has borrowed. More examples about
referential integrity can be found here.
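The library example can be sketched in sqlite3 as follows. The table and column names are adapted from the text, whereas the sample rows and the helper try_delete_user are invented. With the referential constraint in place, the database refuses to remove a library user who still has a book on loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

conn.execute("""
    CREATE TABLE library_user_details (
        user_id TEXT NOT NULL PRIMARY KEY,
        user_name TEXT
    )""")
conn.execute("""
    CREATE TABLE loaned_book (
        user_id TEXT NOT NULL REFERENCES library_user_details(user_id),
        book_call_number TEXT NOT NULL,
        due_date TEXT,
        PRIMARY KEY (user_id, book_call_number)
    )""")
conn.execute("INSERT INTO library_user_details VALUES ('U001', 'Amy')")
conn.execute("INSERT INTO loaned_book VALUES ('U001', 'QA76.1', '2024-01-31')")


def try_delete_user(user_id):
    """Return True if the user row is removed, False if the constraint blocks it."""
    try:
        conn.execute("DELETE FROM library_user_details WHERE user_id = ?", (user_id,))
        return True
    except sqlite3.IntegrityError:
        return False


blocked = not try_delete_user("U001")  # a book is still on loan
conn.execute("DELETE FROM loaned_book WHERE user_id = 'U001'")  # the book is returned
allowed = try_delete_user("U001")
```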
It is important to note that a referential constraint may not enable us to avoid errors at the database
design level. The following example illustrates such a problem.
The table on the left stores ID numbers and names of all library users whereas the table on the right
keeps all loaned books. ID and call number are the primary keys of the library user and loan
event tables respectively. user ID in the loan event table is a foreign key to the library user
table. According to the definition of foreign key, it is acceptable to assign a null value to user ID,
as found in the third record of the loan event table. From a user perspective, it obviously does not
make sense to allow a book to be loaned to an unknown person, but the referential constraint set
between the two tables does not stop the assignment of null to user ID. To avoid the problem, we
need to make user ID in the loan event table a mandatory attribute.
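The difference between an optional and a mandatory foreign key can be sketched as follows. Two variants of the loan event table are declared side by side; their names, the sample call number and the helper function are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE library_user (id TEXT NOT NULL PRIMARY KEY, name TEXT)")

# Variant 1: the foreign key may be null.
conn.execute("""
    CREATE TABLE loan_event_nullable (
        call_number TEXT NOT NULL PRIMARY KEY,
        user_id TEXT REFERENCES library_user(id)
    )""")
# Variant 2: the foreign key is a mandatory attribute.
conn.execute("""
    CREATE TABLE loan_event_mandatory (
        call_number TEXT NOT NULL PRIMARY KEY,
        user_id TEXT NOT NULL REFERENCES library_user(id)
    )""")


def loan_without_user(table):
    """Try to record a loan with no user; return True if the database accepts it."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", ("QA76.1", None))
        return True
    except sqlite3.IntegrityError:
        return False


nullable_accepts = loan_without_user("loan_event_nullable")
mandatory_accepts = loan_without_user("loan_event_mandatory")
```

The referential constraint alone lets the loan to an unknown person through; only the mandatory variant rejects it.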
Teaching remark

SQL92 and SQL99 provide standard features for defining various data
integrity constraints, but many commercial database management systems such as Microsoft
Access tend to provide non-standard customized features to serve the purpose. Such details
are not within the curriculum and will not be further discussed here.
Introduction to Database Design Methodology
Three-Level Database Architecture
A database can be viewed at three levels of abstraction, namely the conceptual level, the physical (or internal)
level and the view (or external) level. The key concerns of the three levels are as follows:

View level or external level is concerned with how individual users see the data. Note that a
user may range from application programmers to casual users who interact with the database
through ad-hoc query facilities. For example, a library user may be interested in the library
collection but not the library user statistics. The librarian would not be expected to have any
interest in the information about individual library users’ reading habits.

Conceptual level is concerned with a community user view of the entire information content of
the database that is of interest to the organization. At this level, no physical considerations are
taken into account. A change in the internal view to improve performance need not involve any
change in the conceptual view of the database.

Physical level or internal level is concerned with how data is actually stored. Efficiency is the
prime concern at this level. The following aspects, among others, are considered at this level:
1. Data structures chosen, e.g. B-trees, hashing, etc.
2. Access paths, e.g. specification of primary and secondary keys, indexes and pointers and
sequencing.
3. Miscellaneous issues, e.g. data encryption and compression techniques.
Figure 1 outlines the three-level database architecture.
Figure 1. The three-level database architecture.
The key advantage of the three-level database architecture is that it separates (1) the conceptual
view from the physical view, and (2) the external views from the conceptual view. The former
enables a database designer to provide a logical description of the database without the need to
specify physical structures. This is often called physical data independence. The latter enables a
database designer to change the conceptual view without affecting the external views in most cases.
This separation is sometimes called logical data independence.
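The two kinds of data independence can be illustrated, under simplifying assumptions, with a view in sqlite3. The base table stands for the conceptual level, the view catalogue stands for one external view, and an index stands for a physical-level choice. All names and rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (call_number TEXT PRIMARY KEY, title TEXT, author TEXT)")
conn.execute("INSERT INTO book VALUES ('QA76.1', 'Databases', 'Chan')")

# External level: what a library user sees of the collection.
conn.execute("CREATE VIEW catalogue AS SELECT title, author FROM book")
before = conn.execute("SELECT * FROM catalogue").fetchall()

# Physical level change: add an index to speed up retrieval by author.
conn.execute("CREATE INDEX idx_book_author ON book (author)")
# Conceptual level change: record a new attribute for every book.
conn.execute("ALTER TABLE book ADD COLUMN shelf TEXT")

after = conn.execute("SELECT * FROM catalogue").fetchall()
```

The external view returns the same rows before and after both changes, which is the practical benefit of separating the three levels.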
Logical Data Modeling
In order to identify the data needs of an organization, logical data modeling is usually applied.
Logical data modeling explores the concepts of a problem domain and their relationships.
In databases, logical data modeling typically takes the form of entity relationship
modeling. The basic idea is to identify data objects called (logical) entity sets, which are described
by their (data) attributes, and their relationships that meet all data requirements of the concerned
organization, typically expressed in a type of diagram called an entity relationship diagram (ERD).
Logical data modeling, like entity relationship modeling, may be performed for the scope of a
single project or for the entire enterprise.
Teaching remarks

Different variants of ERD come with different notations and it is important to tell students to
describe any potentially ambiguous ERD notations when answering a question.

For a comprehensive description on data modeling, the Information Technology Services of the
University of Texas has produced an online practical guide to data modeling which is definitely
worth reading. Note that the ERD notations used there are not always consistent with the ERD
notations adopted in this package.
Terminology and Notation of Entity Relationship Modeling
Some of the terminology of entity relationship modeling that readers need to be familiar with,
such as entity and attribute, has already been covered in the section entitled “Basic
Terminology”. The description below offers some additional information about those
terms as well as details of terms that have not been given previously. The corresponding ERD notation
used in this package is also shown.
Entity

An entity is a representation of any composite information of a real object (e.g., a bank
customer) or an abstract object (e.g., a money withdrawal transaction of a bank).
o Entities encapsulate data only, i.e., an entity is described only by its associated attributes.
How its attributes will be manipulated is out of the scope of the entity. For example, an
entity about a money withdrawal transaction of a bank is concerned with what amount of
money is taken out from which account on a particular date. How those recorded data
may be used for various purposes is immaterial from the logical data modeling perspective.

Entities may be related to one another, e.g., a bank customer may perform a number of money
withdrawal transactions over a given period.

The ERD notation for entities is a rectangle. A STUDENT entity is represented below.
Figure 2. Notation for entities (rectangle).
Attribute

Attributes define the properties of an entity so as to
o name an instance of an entity
o describe the instance
o make reference to another instance
Example: A school subject is an entity which is characterized by the subject code or
name; a subject also has other attributes such as subject description; a subject may not
be taken unless a student has completed its prerequisite subjects which are objects
themselves

The ERD notation for attributes is an oval. An attribute is linked to the associated entity by a
line or two lines depending on whether or not the attribute is a multi-valued attribute. Suppose
the previously mentioned STUDENT entity has two attributes only – name and address. The
corresponding ERD representation is given below.
Figure 3. Notation for attributes (oval connected to a rectangle with a line).
The above example assumes that every student has exactly one name and one address. For a
student who has more than one address, the corresponding ERD representation is as follows:
Figure 4. Notation for multi-valued attributes (oval connected to a rectangle with double lines).
If an attribute is the (primary) key or a component of the primary key of an entity, the attribute
name may be underlined. Assuming each student has a unique name, the corresponding ERD
representation is changed as below.
Figure 5. Notation for key attributes (attribute name underlined).
Teaching remarks

Apparently the A/AS Level curricula do not require students to be familiar with how
a multi-valued attribute is drawn in an ERD.

In reality, it would be tedious to show attributes of entities in an ERD due to space limitation.
Besides, the attributes associated with a selected entity are usually clear from the context.
Thus attributes of entities are typically omitted in an ERD.
Relationship
Relationships are links between entities that define how the entities are associated. There may
be more than one relationship between two (or more) entities, e.g., customers open accounts,
customers close accounts in which open and close are relationships between the customer and
account entity sets. Note that an entity may have a reflexive relationship with itself, e.g., the work
supervisor of an employee of a company is also an employee of the company.
Although a relationship can be classified by its degree, cardinality, connectivity, direction, type, and
existence, etc., not all modeling methodologies use all these classifications. This package will
only focus its discussion on degree, cardinality, and existence.
The ERD notation for a relationship is a diamond with the name of the relationship as the label of the
shape. The following ERD says that a teacher marks assignments.
Figure 6. Notation for relationships (diamond shape connected to associated entities).
Another occasionally used notation for relationships omits the diamond and simply puts the
relationship name as a label on the line that represents the relationship. The previous example is
now depicted as follows:
Figure 7. An alternative notation for relationships (line directly connected to associated entities).
Although in most cases relationships are not associated with any attributes, it is possible that
attributes may be required to describe some relationships. Suppose we have a relationship called
borrow which relates the Student and Book entities. Obviously we need to keep the due date for return for
each book on loan. The information can only be attached to the borrow relationship as it is not an
attribute of Student or Book. In some literature, such a type of relationship is referred to as an
associative entity.
Teaching remark

In the Curriculum and Assessment Guide (C&A guide), no associative entity is mentioned.
However, associating attributes to relationship is very common in practice and the concept
should be covered. Although the literature usually introduces a separate notation for
associative entity, the C&A guide does not provide any for it. Having said that, we may
simply associate attribute(s) to a relationship to capture the essence of an associative entity.
So far, none of the above examples offers any information to answer the following questions.

Would an assignment be marked by more than one teacher? What are the minimum and
maximum numbers of teachers to mark an assignment?

Would a teacher mark more than one assignment? What are the minimum and maximum
numbers of assignments that a teacher needs to mark?

Would there be any teacher who does not need to mark any assignment? Is it acceptable to
have any unmarked assignment?
In order to answer the above questions, we need to know additional properties of the relationship.
Degree
The degree of a relationship is the number of entity sets associated with the relationship. Most
relationships are binary relationships, where the degree is two, but ternary relationships that involve
three entity sets can be found occasionally, e.g., teachers teach subjects to students. An n-ary
relationship is a relationship with degree n.
Many modeling approaches typically deal with binary relationships only. Ternary or n-ary
relationships are typically decomposed into two or more binary relationships. Thus this e-learning
package focuses its discussion on binary relationships only.
Cardinality and Existence (or Modality)
Cardinality defines the actual number of entities that must be included in a relationship.
Cardinality information can be divided into two types – minimum cardinality and maximum
cardinality. Data modeling concerns whether or not the minimum cardinality is zero and whether
or not the maximum cardinality is greater than one, i.e., one (1) or many (n or m), as such
information will affect how a data model is translated into a data schema, i.e., database design.
Existence or modality denotes whether the existence of an entity instance is dependent upon the
existence of another related entity instance. The existence of an entity in a relationship is defined as
mandatory if every instance of the entity is involved in that relationship. Otherwise, the
existence of an entity in a relationship is defined as optional. It is clear that the minimum
cardinality of an entity that has an optional existence must be zero. Conversely, a mandatory
existence of an entity in a relationship implies that the minimum cardinality of the entity in the
relationship is a positive integer.
The following examples are devised to illustrate the above concepts.
Example 1 - Man is-married-to Woman
The minimum cardinality and maximum cardinality of the relationship are 0 and 1 respectively as a
man (or woman) may be married to no woman (or man) and the maximum number of women (or
men) that a man (or woman) can be married to is one.
Obviously, both entities (Man and Woman) optionally participate in the marry relationship. Thus
the existence of both entities in the relationship is optional. In ERD, a small circle is added on the
line that joins an entity and a related relationship if the existence of the entity in the relationship is
optional. An ERD that represents the connectivity and existence information of the marry
relationship is given below.
Figure 8. The “Man is-married-to Woman” scenario.
Example 2 - Mother give-birth-to Child
As a mother must have given birth to at least one child, the corresponding ERD representation is as
follows:
Figure 9. The “Mother give-birth-to Child” scenario.
The minimum cardinality and maximum cardinality of the relationship are 1 and n (where n is a
positive integer) respectively as a mother must have one or more children. Regarding the
existence of the entities in the relationship, both Mother and Child must be involved in the relationship
as every mother must have at least one child whereas every child must have a mother.
Example 3 - Teacher teach Student
Assuming a normal school setting in which all teachers and students are involved in the teach
relationship, the relationship is of the many-to-many (m:n) type as many teachers would teach a
student whereas a teacher would teach many students. According to the assumption, the minimum
cardinality of the relationship is one. The maximum cardinality of the relationship is many. The
corresponding ERD representation is as follows:
Figure 10. A “Teacher teach Student” scenario.
If there is a teacher who is on study leave and thus does not teach any student, the
above ERD will become:
Figure 11. An alternative “Teacher teach Student” scenario.
The last example shows that an ERD can be correctly constructed only if all data requirements are
collected. Any missing requirement may result in an inaccurate data model, which in turn would
mislead a database designer to create an incorrect data schema. Thus, it is important for a database
designer to confirm with the end-users that all data requirements are correctly captured, typically
with the use of ERD as a communication tool. To enable an effective communication between the
two parties, the database designer must teach the end-users how to read an ERD.
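The link between minimum cardinality and existence in the examples above can be summarised in a short Python sketch. The class Participation and its field names are hypothetical; it records, for one entity in a relationship, the minimum and maximum numbers of related instances, and derives the existence from the minimum cardinality.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Participation:
    """How one entity takes part in a relationship (hypothetical helper)."""
    entity: str
    relationship: str
    min_cardinality: int            # 0 or a positive integer
    max_cardinality: Optional[int]  # None stands for "many"

    @property
    def existence(self) -> str:
        # Optional existence is exactly the case of a zero minimum cardinality.
        return "optional" if self.min_cardinality == 0 else "mandatory"

    @property
    def is_many(self) -> bool:
        return self.max_cardinality is None or self.max_cardinality > 1


man = Participation("Man", "is-married-to", 0, 1)
mother = Participation("Mother", "give-birth-to", 1, None)
teacher = Participation("Teacher", "teach", 1, None)
teacher_with_leave = Participation("Teacher", "teach", 0, None)  # study leave allowed

summary = [(p.entity, p.existence, "many" if p.is_many else "one")
           for p in (man, mother, teacher, teacher_with_leave)]
```

The last two entries differ only in the minimum cardinality, which is exactly the difference between Figure 10 and Figure 11.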
Developing Entity Relationship Model
Steps in Developing Data Model
There is no standard way as to how a data model should be built. Typically, entities and
relationships are modeled first, followed by key attributes, then non-key attributes. As an example,
the steps described by the Information Technology Services of the University of Texas in its online
practical guide to data modeling are listed below.
1. Identification of data objects and relationships
2. Drafting the initial ER diagram with entities and relationships
3. Refining the ER diagram
4. Adding key attributes to the diagram
5. Adding non-key attributes
6. Diagramming generalization hierarchies
7. Validating the model through normalization
8. Adding business and integrity rules to the model
Although the steps are presented in a linear manner, the process of database design is usually
iterative, i.e., some steps may need to be repeated before a final design results. This note will only
cover the first three steps. Steps 4-5 are straightforward to follow whereas Steps 6-8 require a
more elaborate discussion which is definitely out of the scope of the current curricula of the A/AS
level computer subjects. In order to explain how a data model can be developed in accordance
with the suggested steps, a problem scenario about a bookstore is given below and illustrations in
light of the example will be given as far as possible.
ABC Bookstore is planning to automate its inventory, enquiry, sales and purchasing
functions by introducing a database management system. The inventory system will
keep track of the stock level of each book title. The sales system will keep track of the
details of each sales order (which is supposed to be of cash sales type only). A sales
order may involve multiple titles of any given quantities. When the inventory of a title
drops below a re-order level, a pre-determined re-order quantity for that title must be
ordered from the supplier of that book title. Each book title is assumed to be supplied
by one publisher only and a publisher may supply multiple book titles.

At the end of each day, the purchasing system will be run to compile a number of
purchase orders detailing the book titles, quantities needed from each publisher. Note
that all book titles to be re-ordered from the same publisher must be grouped into a
single order. Sales details will be removed from the sales system 6 months after the
sale. Details of purchase orders will be removed from the purchasing system 6 months
after the purchase orders are fulfilled. A purchase order is fulfilled when all the items in
the orders are delivered. For simplicity, we assume no partially fulfilled orders.
Concerning the enquiry function, the database should support enquiries based on author
name and book titles.
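To make the re-order rule in the scenario concrete, a minimal Python sketch is given below. It concerns the processing aspect rather than the data model, and everything in it is an assumption made for illustration: the function name, the dictionary keys and the sample data (including the placeholder ISBNs and publisher names).

```python
from collections import defaultdict


def compile_purchase_orders(books):
    """Group titles whose stock is below the re-order level into one order per publisher."""
    orders = defaultdict(list)
    for book in books:
        if book["stock_level"] < book["reorder_level"]:
            orders[book["publisher"]].append((book["isbn"], book["reorder_quantity"]))
    return dict(orders)


# Invented sample data.
books = [
    {"isbn": "ISBN-1", "publisher": "P1", "stock_level": 2, "reorder_level": 5, "reorder_quantity": 20},
    {"isbn": "ISBN-2", "publisher": "P1", "stock_level": 0, "reorder_level": 3, "reorder_quantity": 10},
    {"isbn": "ISBN-3", "publisher": "P2", "stock_level": 9, "reorder_level": 4, "reorder_quantity": 15},
    {"isbn": "ISBN-4", "publisher": "P2", "stock_level": 4, "reorder_level": 4, "reorder_quantity": 15},
]

orders = compile_purchase_orders(books)
```

Both titles of publisher P1 fall into a single order, whereas no order is raised for P2 because neither of its titles has dropped below its re-order level.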
Teaching remarks

Developing an ERD from a problem description is not easy at all and it requires a lot of
expertise. Many learners find it difficult to learn the skill because without any expert’s advice
or comment, they do not know whether the ERD that they have developed is correct or not.
Thus teachers must be prepared to give a lot of feedback to students when teaching the topic.

To help students learn the skill, give them very simple problems (that can be described in no
more than three sentences) to start with.
Identification of data objects (entities) and relationships
Developing an ERD typically begins with a general description of the organization’s operations and
procedures obtained during the requirements analysis. The purpose is

to classify data objects as entities or attributes

to identify relationships between entities

to name and define identified entities, attributes, and relationships
While it is easy to define the basic constructs of the ER model, it is not easy to distinguish their roles
in building the data model. Should a data object be modeled as an entity or attribute? In the ABC
Bookstore example, apparently a book title has attributes like author(s), ISBN, publisher, and year
of publication, etc. It is also possible to model author as a separate entity. The correct answer
usually depends upon the requirements of the database. Generally, the following guidelines are
adopted.

Entities contain descriptive information and they represent many things which share properties.
It is unlikely that an entity set/type would have no descriptive information associated with it or
have one instance only.

Attributes identify (i.e., an identifier), describe entities, or make reference to other entity
instances.

Relationships are associations between entities.
In order to identify all potential entities and attributes, all nouns (or noun phrases) in the problem
description are singled out. Both entities and attributes tend to be associated with those descriptive
noun phrases. If there is no descriptive information associated with a noun phrase, it is unlikely
to be an entity.
Nouns/noun phrases (in the ABC Bookstore example)
ABC Bookstore
inventory
enquiry
sales
purchasing functions
database management system
inventory system
stock level
book title
sales system
details
sales order
cash sales type
re-order level
re-order quantity
supplier
publisher
day
purchasing system
purchase orders
quantities
fulfilled orders
enquiry function
author name
6 months
As we will see soon, some of the above nouns/noun phrases are in fact irrelevant whereas
some additional data that do not appear in the problem description need to be added to the data
model.
Several guidelines can help learners identify candidates of entities and attributes.

It is unlikely that an entity set/type would have no descriptive information associated with it or
have one instance only. For example, there is only one instance of ABC Bookstore. It is
thus unlikely to be an entity set/type. It is not an attribute either. In fact the bookstore offers a
context for the problem scenario and all entities and relationships are under its umbrella.
Another example is “database management system”.

Some general terms like “system” can usually be safely removed while some other general
terms like “details” may need to be elaborated.

A problem description may not be complete. Some data that need to be modeled may be
omitted. It is important for learners to detect such omissions and add the
omitted data objects to the data model. For example, the dates of the sales and
purchase orders have never been mentioned explicitly in the problem description but it is clear
that they must be kept in the database. Another omission is publisher’s details like contact
information.

Some descriptions may be related to the processing aspect instead of the data aspect of the
application and they can be safely skipped when developing a data model. For example,
the second paragraph of the problem description gives details of the processing requirements,
i.e., how data should be processed to give results that users want. Basically what it says is that
programs need to be run (1) to support the enquiry function; (2) to produce purchase orders; and
(3) to remove old purchase and sales order details from the database. The three mentioned
functions rely mostly on data already stored in the database and require only a few new data items to
support those functions, e.g., purchase order fulfillment date.
In reality, end-users are often approached by database designers to clarify data requirements when
developing a database.
The entities identified from the problem description are

Book

Publisher

Sales order

Purchase order
Their attributes are

Book – ISBN (unique for each book), book title, author(s), unit price, stock level, re-order level,
re-order quantity

Publisher – Publisher name (unique for each publisher), address, phone.

Sales order – sales order number, sales order date, sales order amount, (for each book sold)
ISBN, unit price, quantity.

Purchase order – purchase order number, purchase order date, purchase order amount, order
fulfillment date, (for each book ordered) ISBN, quantity.
It is possible that different people may come up with a slightly different set of entities and attributes
even if they all work on the same problem. In the ABC Bookstore example, one may decide to store
stock level, re-order level, and re-order quantity of each book title as a separate entity. Such a
proposal is also acceptable and will result in a slightly different ERD at the end. However the data
schema derived from both ERDs will be the same as we will demonstrate later.
Teaching remarks

Try not to judge the correctness of a list of entities (and perhaps attributes too) from a problem
description that the students pass to you too soon as it will be difficult to know whether or not
their answer is correct without examining the whole ERD.

It is a good idea to identify potential entities and attributes before proceeding to the
identification of relationships. Relationships link entities and thus we can focus on finding
verbs/verb phrases that link the potential entities.
Verbs/verb phrases (in the ABC Bookstore example)
Many printed and online resources suggest identifying potential relationships by identifying
verbs (or verb phrases) in the problem description. However, such a method does not work well in many
cases. For example, some verbs or verb phrases that we have identified from the problem
description are as follows:
… planning to automate its …
… introducing a database management system …
… keep track of the details of …
… is supposed to be …
… may involve multiple titles …
… drops below a re-order level …
… must be ordered …
… is assumed to be …
… supply multiple book titles …
… run to compile …
… grouped into …
… will be removed from …
… are fulfilled …
… are delivered …
It is not easy to see how they can hint at the identification of valid relationships. We propose the
following steps to identify relationships and they are found to be particularly useful in dealing with
small problems.
1. Identify all potential entities first.
2. Explore any possible relationship between each pair of the entities by cross-referencing them
with the problem description. (Only binary relationships are considered.)
3. Read the problem description and see whether the identified entities and relationships can
capture all the user requirements described in the problem description. If not, go back to
Step 1.
32
Earlier on, we have identified four entities: Book, Publisher, Sales order, and Purchase order.
The potential relationships among them are as follows.

is-included-in – Book is-included-in Sales order
is-published-by – Book is-published-by Publisher
is-referred-in – Publisher is-referred-in Purchase order
is-specified-in – Book is-specified-in Purchase order
No obvious relationship can be identified between Sales order and Purchase order, or between
Publisher and Sales order.
Drafting the initial ERD with entities and relationships
The initial ERD aims to provide a pictorial representation of the major entities, and the relationships
between them. Cardinality of each relationship is required to be shown. The initial ERD for the
ABC Bookstore example could be as follows:
Figure 12. An initial ERD for ABC Bookstore.
Figure 13 gives the initial ERD if the inventory information of book title is modeled as a separate
entity.
Figure 13. An alternative initial ERD for ABC Bookstore.
No attributes are shown in the ERD above for simplicity. In practice, details of entities are shown
in a separate document called data object description. The document typically contains the name
of each entity and purpose, name and data type of each attribute for every entity, as well as the
attribute characteristics such as whether its value is unique and/or mandatory, etc.
Refining the ER diagram
Check whether the initial ERD meets all the users' requirements specified in the problem description.
If not, identify the inadequacy, propose new entities, attributes and/or relationships, and redraw
the ERD. For example, one may leave the order fulfillment date out of the Purchase order entity in
the initial ERD, but such an omission can be identified when checking whether the initial ERD is
able to meet the users' requirements specified in the problem description. The ERD given in
Figure 12 (or the one in Figure 13) appears to meet all the users' requirements and thus will
not be refined further.
Converting ERD to Database Tables
An ERD is the result of data analysis, and it is used in the data design process to help generate the data
schema. A basic 3-rule conversion process can be applied to translate an ERD into a data schema
that generally meets the criteria of the third normal form (which will be detailed later). We refer to
this conversion process as the basic conversion process.
Basic Conversion Process
The three rules in the process are as follows:
1. For a 1:1 cardinality relationship, all the attributes of the related entities are grouped into a
single table.
2. For a 1:n cardinality relationship, model each of the related entities in a separate table and post
the primary key of the “one” side entity as a (foreign key) attribute to the table that represents
the “many” side entity.
3. For an m:n cardinality relationship, model each of the related entities in a separate table and
create a new table (which is referred to as the intersection table) and post the primary key of
each entity set/type as an attribute in the new table. If the relationship has its own attributes,
those attributes are to be stored in the intersection table too. The primary key of the
intersection table is a composite key which includes the primary key of each concerned entity
type.
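The three rules above can be sketched as a small Python function. This is only an illustration under stated assumptions: the Entity class, the convert function and the dictionary layout of the result are hypothetical names introduced here and are not part of the notes. For a 1:n relationship, the first argument is taken to be the “one” side.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    key: list[str]    # primary-key attributes
    attrs: list[str]  # non-key attributes


def convert(left, right, cardinality, rel_name="", rel_attrs=()):
    """Return {table_name: (key_attributes, other_attributes)} for one relationship.

    cardinality is "1:1", "1:n" (left is the "one" side) or "m:n".
    """
    if cardinality == "1:1":
        # Rule 1: group all attributes of both entities into a single table.
        return {left.name: (left.key, left.attrs + right.key + right.attrs)}
    if cardinality == "1:n":
        # Rule 2: post the key of the "one" side to the "many" side as a foreign key.
        return {
            left.name: (left.key, left.attrs),
            right.name: (right.key, right.attrs + left.key),
        }
    if cardinality == "m:n":
        # Rule 3: add an intersection table whose composite key holds both primary keys.
        return {
            left.name: (left.key, left.attrs),
            right.name: (right.key, right.attrs),
            rel_name: (left.key + right.key, list(rel_attrs)),
        }
    raise ValueError(f"unknown cardinality: {cardinality}")


book = Entity("Book", ["ISBN"], ["book_title", "author", "unit_price"])
publisher = Entity("Publisher", ["publisher_name"], ["address", "phone"])
order = Entity("Purchase_order", ["purchase_order_number"],
               ["purchase_order_date", "purchase_order_amount", "order_fulfillment_date"])

one_to_many = convert(publisher, order, "1:n")
tables = convert(book, order, "m:n", "Book_in_Purchase_order", ("ordered_quantity",))
```

With these inputs, one_to_many["Purchase_order"] ends with publisher_name as the posted foreign key, and tables["Book_in_Purchase_order"] has the composite key (ISBN, purchase_order_number) plus ordered_quantity.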
Example 1 – 1:1 Relationship
In the ABC Bookstore example, if an Inventory entity is introduced for representing the inventory
information of book title (Book), we will have the following relationship.
Figure 14. The Book is-associated-with Inventory relationship.
The relationship indicates that each book title is associated with exactly one piece of inventory
information and vice versa. Since the relationship is of 1:1 type, all attributes of the entities will
be stored in the same table according to the first rule of the basic conversion process. As a result,
the attributes to be stored in the resultant table will be exactly the same as those of the table
corresponding to the Book entity in the original ERD. They are ISBN (unique for each book), book title,
author(s), unit price, stock level, re-order level, and re-order quantity. This explains why different
ERDs may lead to the same data schema.
For ease of reference, the attributes of a table are shown in the following notation.
TableName(key-attribute1, …, key-attributeN, other-attribute1, other-attribute2, …)
The attributes of the Book table are given below.
Book(ISBN, book_title, author, unit_price, stock_level, re-order_level, re-order_quantity)
Note that all author names of a book title are assumed to be stored in the author field. Besides,
more attribute(s) will be added to the above Book table when we deal with the relationship between the
Book and Publisher entities.
Example 2 – 1:n Relationship
In the ABC Bookstore example, we have the following relationship that links the Publisher and
Purchase order entities.
Figure 15. The Publisher is-referred-in Purchase order relationship.
The relationship indicates that a publisher may be associated with any number of purchase orders
(zero to many) whereas each purchase order is associated with exactly one publisher (as each
purchase order will only be placed with one publisher). According to the second rule of the basic
conversion process, the primary key (or identifier) of the Publisher entity must be posted to the table
that represents the Purchase order entity. The resultant tables for representing the relationship
will be as follows:
Publisher(publisher_name, address, phone)
Purchase_order(purchase_order_number, purchase_order_date, purchase_order_amount,
order_fulfillment_date)
Note that the ISBN and quantity of each book title being specified in a purchase order are excluded
from the Purchase_order table as there exists an m:n relationship between the purchase order and
book title entities. Such attributes need to be housed in a separate table as illustrated in the next
example.
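As a rough illustration of the second rule, the sketch below builds the two tables in an in-memory SQLite database using Python's sqlite3 module. Note the assumptions: the column types, the sample rows and the publisher names are invented for the demonstration, and the sketch adds a publisher_name column to Purchase_order as the posted foreign key that the second rule calls for, even though that column is not listed in the table structure above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Publisher (
    publisher_name TEXT PRIMARY KEY,
    address        TEXT,
    phone          TEXT
);
CREATE TABLE Purchase_order (
    purchase_order_number  INTEGER PRIMARY KEY,
    purchase_order_date    TEXT,
    purchase_order_amount  REAL,
    order_fulfillment_date TEXT,
    -- Rule 2 (assumed column): key of the "one" side posted to the "many" side.
    publisher_name TEXT NOT NULL REFERENCES Publisher(publisher_name)
);
""")

# Hypothetical sample data: one publisher with two purchase orders.
conn.execute("INSERT INTO Publisher VALUES ('Sample Press', '1 Example Road', '0000 0000')")
conn.executemany(
    "INSERT INTO Purchase_order VALUES (?, ?, ?, ?, ?)",
    [(1, "2024-01-05", 1200.0, None, "Sample Press"),
     (2, "2024-02-10", 800.0, None, "Sample Press")],
)

# An order that refers to an unknown publisher violates referential integrity.
try:
    conn.execute(
        "INSERT INTO Purchase_order VALUES (3, '2024-03-01', 50.0, NULL, 'Unknown House')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

orders_per_publisher = conn.execute(
    "SELECT publisher_name, COUNT(*) FROM Purchase_order GROUP BY publisher_name"
).fetchall()
```

The foreign key lets one publisher appear in many purchase orders while each order still points to exactly one publisher.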
Example 3 – m:n Relationship
In the ABC Bookstore example, we have the following relationship that links the Book and
Purchase order entities.
Figure 16. The Book is-specified-in Purchase order relationship.
The relationship indicates that a book title may be associated with any number of purchase orders
(zero to many) whereas each purchase order is associated with at least one book title. According
to the third rule of the basic conversion process, the primary keys of both the Book and Purchase
order entities must be posted to a new table, i.e. the intersection table, which links the tables that
represent the concerned entities. The resultant tables for representing the relationship will be as
follows:
Book(ISBN, book_title, author, unit_price, stock_level, re-order_level, re-order_quantity)
Purchase_order(purchase_order_number,
purchase_order_date,
purchase_order_amount,
order_fulfillment_date)
Book_in_Purchase_order(purchase_order_number, ISBN, ordered_quantity)
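The intersection table can be tried out with a minimal sqlite3 sketch. The column lists are shortened and the ISBNs, titles and quantities are made-up sample values, so treat the details as assumptions rather than the bookstore's actual data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Book (ISBN TEXT PRIMARY KEY, book_title TEXT);
CREATE TABLE Purchase_order (
    purchase_order_number INTEGER PRIMARY KEY,
    purchase_order_date   TEXT
);
CREATE TABLE Book_in_Purchase_order (
    purchase_order_number INTEGER REFERENCES Purchase_order(purchase_order_number),
    ISBN                  TEXT    REFERENCES Book(ISBN),
    ordered_quantity      INTEGER,          -- attribute of the relationship itself
    PRIMARY KEY (purchase_order_number, ISBN)   -- composite key
);
INSERT INTO Book VALUES ('111', 'Title A'), ('222', 'Title B');
INSERT INTO Purchase_order VALUES (1, '2024-01-05'), (2, '2024-02-10');
INSERT INTO Book_in_Purchase_order VALUES (1, '111', 10), (1, '222', 5), (2, '111', 20);
""")

# One purchase order may specify many titles ...
titles_on_order_1 = conn.execute("""
    SELECT b.book_title, l.ordered_quantity
    FROM Book_in_Purchase_order AS l
    JOIN Book AS b ON b.ISBN = l.ISBN
    WHERE l.purchase_order_number = 1
    ORDER BY b.book_title
""").fetchall()

# ... and one title may be specified in many purchase orders.
orders_for_111 = conn.execute(
    "SELECT COUNT(*) FROM Book_in_Purchase_order WHERE ISBN = '111'"
).fetchone()[0]
```

Each row of Book_in_Purchase_order records one (order, title) pair, which is why the pair of foreign keys can serve as the primary key.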
Example 4 – Data schema for the ABC Bookstore Example
After applying the 3 rules specified in the basic conversion process, we can obtain the data schema
for the ABC Bookstore example as follows:
Book(ISBN, book_title, author, unit_price, stock_level, re-order_level, re-order_quantity,
publisher_name)
Publisher(publisher_name, address, phone)
Purchase_order(purchase_order_number, purchase_order_date, purchase_order_amount,
order_fulfillment_date)
Book_in_purchase_order(purchase_order_number, ISBN, ordered_quantity)
Sales_order(sales_order_number, sales_order_date, sales_order_amount)
Book_in_sales_order(sales_order_number, ISBN, quantity_sold, unit_price)
Note that a new field, publisher_name, has been added to the Book table after considering the
relationship between the Book and Publisher entities. This illustrates that the definition of a table
cannot be finalized until all relationships connected to the entity concerned have been considered.
Drawback of Basic Conversion Process
The basic conversion process does not guarantee that null attribute values are minimized, and
problems may occur for entities with optional occurrences. Suppose a school has a number of
lockers in different buildings and each student is entitled to have one locker on request. Due to the
uneven demand for lockers in different buildings, some lockers are unused whereas some students
are not assigned any locker. The relationship is given below.
Figure 17. The Student is-assigned-to Locker relationship to illustrate the drawback of the basic
conversion process.
Since the relationship is of 1:1 type, we may put all attributes of the two entities together into one
single table.
Assuming the attributes of the Student and Locker entities are:

Student – student_ID, student_name, programme_enrolled.
Locker – locker_ID, building, floor.
Two possible table structures can be developed as below.
Student(student_ID, student_name, programme_enrolled, locker_ID, building, floor)
Locker(locker_ID, building, floor, student_ID, student_name, programme_enrolled)
Both of the above table structures are problematic. In the first table structure, lockers that are not
assigned to any students cannot be represented. In the second table structure, students that are not
assigned to any lockers cannot be represented.
Optional-max Conversion Process
The problem illustrated in the last example can be overcome by introducing another rule to the basic
conversion process. We refer to the augmented process as the optional-max conversion process.
The rules to be applied in the new process are as follows:
1. For every instance where the lower cardinality bound is zero and the upper cardinality bound is
one, temporarily relabel the upper cardinality bound as n, i.e., many.
2. Apply the basic conversion process as usual.
After applying rule 1 of the optional-max conversion process, the relationship becomes
Figure 18. The Student is-assigned-to Locker relationship after the first rule of the optional-max
conversion process is applied.
Now the relationship is treated as an m:n relationship and will be modeled by three tables according
to the third rule of the basic conversion process. The resultant tables are:
Student(student_ID, student_name, programme_enrolled)
Locker(locker_ID, building, floor)
Assign(student_ID, locker_ID)
With the new table structures, details of both empty lockers and students who are not given any
lockers can be represented.
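A minimal sqlite3 sketch of the three tables is given below. The sample students and lockers are hypothetical, and the UNIQUE constraints on the Assign table are an added assumption (not part of the notes) to keep the real-world restriction that a student holds at most one locker and a locker serves at most one student.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (student_ID TEXT PRIMARY KEY, student_name TEXT, programme_enrolled TEXT);
CREATE TABLE Locker  (locker_ID TEXT PRIMARY KEY, building TEXT, floor INTEGER);
CREATE TABLE Assign (
    -- UNIQUE is an assumption that preserves the original 1:1 restriction.
    student_ID TEXT NOT NULL UNIQUE REFERENCES Student(student_ID),
    locker_ID  TEXT NOT NULL UNIQUE REFERENCES Locker(locker_ID),
    PRIMARY KEY (student_ID, locker_ID)
);
INSERT INTO Student VALUES ('S1', 'Student One', 'Programme A'),
                           ('S2', 'Student Two', 'Programme B');
INSERT INTO Locker  VALUES ('L1', 'Block A', 1), ('L2', 'Block B', 2);
INSERT INTO Assign  VALUES ('S1', 'L1');
""")

# Students without a locker are still stored in Student; no null columns are needed.
students_without_locker = conn.execute("""
    SELECT s.student_ID
    FROM Student AS s
    LEFT JOIN Assign AS a ON a.student_ID = s.student_ID
    WHERE a.locker_ID IS NULL
""").fetchall()

# Unused lockers are still stored in Locker.
unused_lockers = conn.execute("""
    SELECT l.locker_ID
    FROM Locker AS l
    LEFT JOIN Assign AS a ON a.locker_ID = l.locker_ID
    WHERE a.student_ID IS NULL
""").fetchall()
```

Only actual assignments occupy rows in Assign, so neither base table needs a nullable column to record "no locker" or "no student".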
Teaching remark

One may suggest handling the relationship as a 1:m type. This will result in two tables, either
with the student_ID posted to the Locker table or the locker_ID posted to the Student table. The
proposed table structures can also represent empty lockers and students who are not given any
lockers. However, the proposal will result in null entries in at least one table. In situations
that involve an associative entity (i.e. a relationship with attributes), more null entries would
result. For example, if the date on which a locker is assigned to a student is to be recorded, the
field will be null for an unassigned locker should we treat the relationship as a 1:m type. The
optional-max conversion process provides a more resilient solution to the problem as the date
field will be kept in the intersection table, i.e., the Assign table.
Introduction to Normalization
Normalization is a database design technique based on analyzing the relations among key and non-key
attributes of database tables. This technique includes a series of rules or steps to normalize the
database into a number of tables depending on the degree of normalization that one wants to
achieve. A database design that complies with those rules corresponds to a specific normal form such as
first normal form (1NF), second normal form (2NF), third normal form (3NF), etc. Despite
the existence of higher normal forms, only 1NF, 2NF and 3NF will be covered. Higher normal
forms imply a data schema with more tables, and querying such a database involves more
effort in “joining” tables together. In practice, most database designers generate data schemata
normalized to 3NF in order to strike a balance between maintainability and efficiency. Readers
who are interested in an overview of the various normal forms (from 1NF to 6NF) may visit
Wikipedia’s page on database normalization.
Why Normalization
The main purpose of normalization is to minimize data redundancy and anomalies. In the following
section, we will show the problem of data redundancy and update anomalies through a problem
scenario.
Data Anomaly
Data anomaly refers to the unexpected phenomena that occur when updating a database that
exhibits data redundancy. There are several types of data anomaly – insertion, deletion and
modification (or update) anomalies.
Insertion Anomaly
Could we record the insertion of some data object of interest in a table? If not, the table suffers from
insertion anomaly.
Deletion Anomaly
Could we record the deletion of some data object of interest in a table without losing any other
information? If not, the table suffers from deletion anomaly.
Modification Anomaly
Would an update to one attribute’s value have to be recorded in a table more than once? If yes, the
table suffers from modification anomaly.
Functional Dependencies
In order to understand why data anomalies exist, we need to understand the concept of
functional dependencies. Functional dependencies are used to describe the dependency between
the attributes within a table.
Given that A and B are attributes of the same table, the attribute B is functionally dependent on the
attribute A if each value of A is associated with one and only one value of B. The notation to
represent the above notion is A → B. It may be read as A determines B.
Suppose A is a composite attribute. Attribute B is said to be fully functionally dependent on attribute
A if B is functionally dependent on A and not functionally dependent on any proper subset of A.
If B is functionally dependent on some proper subset of A, B is said to be partially dependent on A.
Teaching remarks

Some textbooks and online resources on databases may define full functional dependency as
follows: Attribute B is said to be fully functionally dependent on attribute A if B is functionally
dependent on A and not functionally dependent on any subset of A. Such a definition is
incorrect as it fails to distinguish between a proper subset and a subset.
Any set is a subset of itself. A proper subset of a set is any subset of that set excluding the set
itself.

An A/AS level textbook defines partial dependency as follows: one or more non-key attributes
depend on part of the primary key. This is not entirely correct as the notion of functional
dependencies does not restrict the independent attribute (attribute A) to be a primary key as
described in the book.
Suppose there is a Student_in_Society table storing information about student roles in various
societies and clubs in a school. The table also contains information of the teacher supervisor of
each society. The table has the following attributes (field name in parentheses): student_ID
(StdID), student_name (StdName), society_ID (SocietyID), society_name (SocName),
student_role_in_society (Position), society_teacher_ID (SupID), and society_teacher_name
(Supervisor). Given the fact that each society has exactly one society teacher to give the society
advice, the primary key of the table is a composite key composed of student_ID and society_ID.
Figure 19 shows the full functional dependencies among the attributes in the table.
Figure 19. Full functional dependencies among attributes in the Student_in_Society table.
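The definitions of functional, full and partial dependency can be tested against sample rows with a short Python sketch. The function names determines and fully_dependent are hypothetical, and the rows are a cut-down version of the Student_in_Society data used in these notes. Keep in mind that a check on sample data can only refute a dependency; whether a dependency really holds is a statement about all valid data, which comes from the problem description.

```python
from itertools import combinations


def determines(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Each value of lhs must be associated with one and only one value of rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


def fully_dependent(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no non-empty proper subset of lhs determines rhs."""
    if not determines(rows, lhs, rhs):
        return False
    proper_subsets = (
        subset
        for size in range(1, len(lhs))
        for subset in combinations(lhs, size)
    )
    return not any(determines(rows, subset, rhs) for subset in proper_subsets)


rows = [
    {"StdID": "042123", "SocietyID": "001", "SocName": "Chinese", "Position": "Chairman"},
    {"StdID": "042123", "SocietyID": "003", "SocName": "Maths", "Position": "Member"},
    {"StdID": "042132", "SocietyID": "001", "SocName": "Chinese", "Position": "Member"},
    {"StdID": "042142", "SocietyID": "002", "SocName": "English", "Position": "Member"},
    {"StdID": "042142", "SocietyID": "005", "SocName": "Physics", "Position": "Chairman"},
    {"StdID": "042142", "SocietyID": "008", "SocName": "Biology", "Position": "Member"},
]

key = ("StdID", "SocietyID")
position_is_full = fully_dependent(rows, key, ("Position",))  # needs both parts of the key
socname_is_full = fully_dependent(rows, key, ("SocName",))    # SocietyID alone is enough
socname_by_society = determines(rows, ("SocietyID",), ("SocName",))
```

In this sample, Position depends on the whole composite key, whereas SocName is already determined by SocietyID alone and is therefore only partially dependent on the key.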
First Normal Form
If every attribute of the relation is atomic, then the relation is said to be in first normal form (1NF).
An attribute is atomic if it is not multi-valued, i.e. without repeating groups. A table which is not
in 1NF is in unnormalized form (UNF).
The Student_in_Society table below is in UNF as SocietyID is a multi-valued attribute.
StdID   StdName    SocietyID  SocName  SupID  Supervisor  Position
042123  May Wong   001        Chinese  1      Mr. Wong    Chairman
                   003        Maths    2      Ms. Chan    Member
042132  Katie Lee  001        Chinese  1      Mr. Wong    Member
042142  June Chan  002        English  1      Mr. Wong    Member
                   005        Physics  3      Mr. Lee     Chairman
                   008        Biology  4      Miss Yu     Member
Figure 20. The Student_in_Society table in UNF.
The usual way to modify a table in UNF to 1NF is to store the details of the repeating groups in a
separate table. This will result in the following table structures.
Student(StdID, StdName)
Student_in_Society(StdID, SocietyID, SocName, SupID, Supervisor, Position)
The tables with data are shown in Figure 21.
Student table
StdID   StdName
042123  May Wong
042132  Katie Lee
042142  June Chan
Student_in_Society table
StdID   SocietyID  SocName  SupID  Supervisor  Position
042123  001        Chinese  1      Mr. Wong    Chairman
042123  003        Maths    2      Ms. Chan    Member
042132  001        Chinese  1      Mr. Wong    Member
042142  002        English  1      Mr. Wong    Member
042142  005        Physics  3      Mr. Lee     Chairman
042142  008        Biology  4      Miss Yu     Member
Figure 21. The Student and Student_in_Society tables in 1NF.
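The step from UNF to 1NF can be sketched in Python by flattening the repeating group. The representation of the unnormalized rows (a dictionary with a memberships list) is an assumption made for this illustration; the data values are those of Figures 20 and 21.

```python
# Each unnormalized row holds a repeating group of society memberships:
# (SocietyID, SocName, SupID, Supervisor, Position)
unf_rows = [
    {"StdID": "042123", "StdName": "May Wong",
     "memberships": [("001", "Chinese", 1, "Mr. Wong", "Chairman"),
                     ("003", "Maths", 2, "Ms. Chan", "Member")]},
    {"StdID": "042132", "StdName": "Katie Lee",
     "memberships": [("001", "Chinese", 1, "Mr. Wong", "Member")]},
    {"StdID": "042142", "StdName": "June Chan",
     "memberships": [("002", "English", 1, "Mr. Wong", "Member"),
                     ("005", "Physics", 3, "Mr. Lee", "Chairman"),
                     ("008", "Biology", 4, "Miss Yu", "Member")]},
]

# Student(StdID, StdName): the non-repeating attributes stay with the student.
student = [(row["StdID"], row["StdName"]) for row in unf_rows]

# Student_in_Society(StdID, SocietyID, SocName, SupID, Supervisor, Position):
# one row per membership, with StdID copied in to identify the student.
student_in_society = [
    (row["StdID"], *membership)
    for row in unf_rows
    for membership in row["memberships"]
]

# Every (StdID, SocietyID) pair is unique, so it can serve as the composite key.
composite_keys = {(r[0], r[1]) for r in student_in_society}
```

Every attribute in both resulting tables now holds a single value per row, and a student may join any number of societies without a change to the table structure.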
It is a bad idea to store the multi-valued data in the following table structure.
Student_in_Societies(StdID, StdName, SocietyID1, SocName1, SupID1, Supervisor1, Position1,
SocietyID2, SocName2, SupID2, Supervisor2, Position2, SocietyID3, SocName3,
SupID3, Supervisor3, Position3)
The table above cannot accurately represent the relationship in the real world because a student
should not be restricted to joining at most three societies. Allowing a student to join a fourth society
requires a modification of the table structure, which can be troublesome once data have been entered
in the table. In any case, the table is not in 1NF.
Note that many data anomalies cannot be removed by normalizing tables to 1NF. For example, if
Mr. Kwan replaces Mr. Wong as the society teacher of the Chinese Society, two rows in the
Student_in_Society table in Figure 21 need to be updated (i.e., modification anomaly). The table also
suffers from insertion anomaly, as we cannot store information about a new society until at least one
student has joined it. Deletion anomaly exists when the last member of a society quits: the society
information will then be permanently removed from the database.
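The three anomalies can be reproduced with a small sqlite3 sketch that loads the 1NF Student_in_Society table of Figure 21. Some details are assumptions made only for the demonstration: the SupID value 5 for Mr. Kwan, the new society '009' (Chess), and the NOT NULL declarations on the key columns (SQLite would otherwise accept a NULL in this kind of primary key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student_in_Society (
        StdID TEXT NOT NULL, SocietyID TEXT NOT NULL, SocName TEXT,
        SupID INTEGER, Supervisor TEXT, Position TEXT,
        PRIMARY KEY (StdID, SocietyID)
    )
""")
conn.executemany(
    "INSERT INTO Student_in_Society VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("042123", "001", "Chinese", 1, "Mr. Wong", "Chairman"),
        ("042123", "003", "Maths", 2, "Ms. Chan", "Member"),
        ("042132", "001", "Chinese", 1, "Mr. Wong", "Member"),
        ("042142", "002", "English", 1, "Mr. Wong", "Member"),
        ("042142", "005", "Physics", 3, "Mr. Lee", "Chairman"),
        ("042142", "008", "Biology", 4, "Miss Yu", "Member"),
    ],
)

# Modification anomaly: one real-world change touches several rows.
# (SupID 5 for Mr. Kwan is a hypothetical value.)
cur = conn.execute(
    "UPDATE Student_in_Society SET SupID = 5, Supervisor = 'Mr. Kwan' WHERE SocietyID = '001'"
)
rows_touched = cur.rowcount

# Insertion anomaly: a new society without members has no StdID for the key.
# (Society '009' is a hypothetical example.)
try:
    conn.execute(
        "INSERT INTO Student_in_Society VALUES (NULL, '009', 'Chess', 2, 'Ms. Chan', NULL)"
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Deletion anomaly: removing the only Maths member removes the society as well.
conn.execute("DELETE FROM Student_in_Society WHERE StdID = '042123' AND SocietyID = '003'")
maths_rows_left = conn.execute(
    "SELECT COUNT(*) FROM Student_in_Society WHERE SocietyID = '003'"
).fetchone()[0]
```

Here the update touches two rows, the insertion is rejected, and no trace of the Maths society remains after its only member is deleted.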
Second Normal Form
A table is in the second normal form (2NF) if
1. it is in 1NF, and
2. it exhibits no partial dependencies, i.e., every non-key attribute in the table is fully functionally
dependent on the primary key of the table.
If a table is in 1NF but not in 2NF, it must have a composite primary key according to the second
property of 2NF. To “promote” a table from 1NF to 2NF, we need to remove the partial
dependencies in the table.
Let us further work on the Student_in_Society table in Figure 21 to illustrate the notion of 2NF.
The functional dependencies in the Student_in_Society table are as follows:
StdID, SocietyID → Position (full functional dependency)
SocietyID → SocName (partial dependency, as SocietyID is only a part of the primary key)
SocietyID → SupID (partial dependency, as SocietyID is only a part of the primary key)
We can reconstruct a table in 1NF to 2NF by extracting those fields that exhibit partial dependency
in the table to one or more separate tables. In our example, the Student_in_Society table can be
made to conform to 2NF by extracting SocName, SupID and Supervisor to a separate table, say the
Society table. The attribute on which the three extracted fields are fully functionally dependent, i.e.,
SocietyID, will be copied to the Society table to serve as the table’s primary key. The new table
structures are:
Student(StdID, StdName)
Society(SocietyID, SocName, SupID, Supervisor)
Student_in_Society(StdID, SocietyID, Position)
The tables in 2NF with their data are shown in Figure 22.
Student table
StdID   StdName
042123  May Wong
042132  Katie Lee
042142  June Chan
Society table
SocietyID  SocName      SupID  Supervisor
001        Chinese      1      Mr. Wong
002        English      1      Mr. Wong
003        Mathematics  2      Ms. Chan
005        Physics      3      Mr. Lee
008        Biology      4      Miss Yu
Student_in_Society table (revised)
StdID   SocietyID  Position
042123  001        Chairman
042123  003        Member
042132  001        Member
042142  002        Member
042142  005        Chairman
042142  008        Member
Figure 22. The Student, Society and Student_in_Society (revised) tables in 2NF.
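The extraction of the partially dependent fields can be sketched as two projections of the 1NF rows in plain Python. The variable names are hypothetical, and the rows follow Figure 21 (which spells the society name as Maths). Rejoining the two projections on SocietyID gives back the original rows, which suggests that no information is lost by the split.

```python
# 1NF rows: (StdID, SocietyID, SocName, SupID, Supervisor, Position)
first_nf = [
    ("042123", "001", "Chinese", 1, "Mr. Wong", "Chairman"),
    ("042123", "003", "Maths", 2, "Ms. Chan", "Member"),
    ("042132", "001", "Chinese", 1, "Mr. Wong", "Member"),
    ("042142", "002", "English", 1, "Mr. Wong", "Member"),
    ("042142", "005", "Physics", 3, "Mr. Lee", "Chairman"),
    ("042142", "008", "Biology", 4, "Miss Yu", "Member"),
]

# Society(SocietyID, SocName, SupID, Supervisor): fields that depend on SocietyID alone.
# Building a set collapses the duplicated Chinese Society rows into one.
society = sorted({(soc, name, sup_id, sup) for _, soc, name, sup_id, sup, _ in first_nf})

# Student_in_Society(StdID, SocietyID, Position): fields that need the whole key.
student_in_society = [(std, soc, pos) for std, soc, _, _, _, pos in first_nf]

# Rejoin on SocietyID to check that the decomposition loses nothing.
society_by_id = {row[0]: row for row in society}
rejoined = [
    (std, soc, *society_by_id[soc][1:], pos)
    for std, soc, pos in student_in_society
]
lossless = rejoined == first_nf

# Redundancy that 2NF does not remove: the teacher's name is still repeated.
rows_naming_mr_wong = sum(1 for row in society if row[3] == "Mr. Wong")
```

The Society table ends up with five rows, one per society, yet Mr. Wong's name still appears in two of them.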
Tables in 2NF do not remove all data anomalies either. Although the insertion and deletion
anomalies associated with the Student_in_Society table (in 1NF) have gone, the modification
anomaly still exists in the Society table in Figure 22. Suppose Mr. Wong resigns and a new
teacher, Mr. Kwan, replaces Mr. Wong as the society teacher of all societies that Mr.
Wong used to be responsible for. Note that Mr. Kwan will use the same SupID as Mr. Wong did.
To reflect such a change in the Society table, two rows (instead of one) need to be updated.
Third Normal Form
A table is in 3NF if:
1. it is in 2NF, and
2. it exhibits no transitive dependencies.
Transitive dependency exists if one or more attributes are functionally dependent on some non-key
attribute(s). Suppose there are three attributes in a table called A, B and C such that A → B and B → C.
Then A → C, and the attribute C is transitively dependent on A. In the Society table of our
example, SocietyID → SupID and SupID → Supervisor, and thus SocietyID → Supervisor, which is
a transitive dependency. To convert a table in 2NF to 3NF, attributes that contribute to
transitive dependencies are extracted to separate table(s). The Society table can be made to conform
to 3NF by extracting Supervisor to a new table, say the Society_Teacher table. The attribute on which
the Supervisor field is fully functionally dependent, i.e., SupID, is copied to the Society_Teacher
table to serve as the table’s primary key. This will result in the following table structures.
Student(StdID, StdName)
Society(SocietyID, SocName, SupID)
Student_in_Society(StdID, SocietyID, Position)
Society_Teacher(SupID, Supervisor)
The tables in 3NF with their data are shown in Figure 23.
Student table
StdID   StdName
042123  May Wong
042132  Katie Lee
042142  June Chan
Society table (revised)
SocietyID  SocName      SupID
001        Chinese      1
002        English      1
003        Mathematics  2
005        Physics      3
008        Biology      4
Student_in_Society table
StdID   SocietyID  Position
042123  001        Chairman
042123  003        Member
042132  001        Member
042142  002        Member
042142  005        Chairman
042142  008        Member
Society_Teacher table
SupID  Supervisor
1      Mr. Wong
2      Ms. Chan
3      Mr. Lee
4      Miss Yu
Figure 23. The Student, Society (revised), Student_in_Society and
Society_Teacher tables in 3NF.
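To see the effect of the 3NF design on the modification anomaly, the sketch below loads the Society and Society_Teacher tables of Figure 23 into an in-memory SQLite database and replays the scenario in which Mr. Kwan takes over Mr. Wong's SupID. The column types are assumptions; the data values are those of the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Society_Teacher (SupID INTEGER PRIMARY KEY, Supervisor TEXT);
CREATE TABLE Society (
    SocietyID TEXT PRIMARY KEY,
    SocName   TEXT,
    SupID     INTEGER REFERENCES Society_Teacher(SupID)
);
INSERT INTO Society_Teacher VALUES (1, 'Mr. Wong'), (2, 'Ms. Chan'),
                                   (3, 'Mr. Lee'), (4, 'Miss Yu');
INSERT INTO Society VALUES ('001', 'Chinese', 1), ('002', 'English', 1),
                           ('003', 'Mathematics', 2), ('005', 'Physics', 3),
                           ('008', 'Biology', 4);
""")

# The teacher's name is stored once, so the replacement is a single-row update.
cur = conn.execute("UPDATE Society_Teacher SET Supervisor = 'Mr. Kwan' WHERE SupID = 1")
rows_touched = cur.rowcount

# A join brings the name back to every society that refers to SupID 1.
societies_of_mr_kwan = conn.execute("""
    SELECT s.SocName
    FROM Society AS s
    JOIN Society_Teacher AS t ON t.SupID = s.SupID
    WHERE t.Supervisor = 'Mr. Kwan'
    ORDER BY s.SocietyID
""").fetchall()
```

Only one row is changed, and both the Chinese and English societies show the new teacher when the tables are joined. The price is the extra join at query time, which is the maintainability and efficiency trade-off mentioned in the introduction to normalization.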
Figure 24 shows the full series of changes introduced to transform the original data schema (in UNF)
to the final design (in 3NF).
Figure 24. How the original design evolved from UNF to 3NF.
Database Design Exercise 01
For the description of the following scenarios, complete the ER diagram.
Scenario description:
1. In a school, students are allocated to different classes. Each student must be allocated to exactly one
class, and a class is formed by at least 30 students. Each class must be managed by several different
students, namely, prefect, monitor, etc.
[ER diagram to be completed: entities STUDENT, CLASS and CLASS POST; relationships "Is allocated to", "Is managed by" and "Is assigned to"]
2. A construction company has over 1000 employees. A client can hire this company to do projects. Usually,
several types of employees are grouped together to finish a project, e.g. a project may require an
accountant, 2 engineers, 1 manager and 1 system analyst. At the same time, an employee may take up
more than one project. Also, finishing a project may require a number of pieces of equipment.
[ER diagram to be completed: entities Equipment, Employee, Client and Project; relationships "Is assigned to", "Hires" and "Works"]
Transform the following ER diagram into the database structure. Please show the structure of the database in
the form of
Tablename (keyfield, field1, field2, …)
3. In a school, a student may be assigned one or more functional posts, like prefect, monitor, chairman.
A post must be assigned to exactly one student. Complete the following E-R diagram.
[E-R diagram to be completed: attributes Stud_ID, Name, Address, Date_birth, Post_ID and Post_Name; relationship "Is assigned to"]
Then, transform the above ER diagram into database structure:
4. A chain store has a number of branches, and each branch has a manager and several staff.
E.g. staff1 and staff2 belong to branchA whereas staff3 and staff4 belong to branchB. The salaries of the
staff follow salary points, which depend on their positions and years of service. E.g. a
manager with 5 years of service will have a salary point of 15, which is $25,000, and a junior staff
member with 2 years will have a salary point of 2, which is $6,000. Complete the ER diagram:
Then, write down the database structure:
5. Which of the following would be multi-valued attributes?
a) Contact person for a company
b) Qualification of a teacher
c) The name of CEO for a company
d) The contact phone number for a student
e) The title of a book
f) Medicine for a patient
g) The owner of a credit card
h) The courses taken by an undergraduate
Give two examples of your own:
6. Staff in a trading company A will purchase products from other companies through some sales agents.
a) Construct the ER diagram if there is just one sales agent for each company and staff from different
departments may contact the same company.
[ER diagram to be completed: DEPARTMENT have STAFF; STAFF contact COMPANY]
b) Construct another ER diagram if there may be more than one sales agent for a company.
[ER diagram to be completed: DEPARTMENT have STAFF; STAFF contact COMPANY]
c) To remove the multi-valued problem, we can transform the ER diagram into
[ER diagram: COMPANY through CONTACT_LIST]
The database structure would now become
COMPANY (Comp_ID, Name, Address)
CONTACT_LIST (Comp_ID, Agent_Name, Phone, Email)
7. A patient takes more than one medicine, and so, the ER diagram will be
[ER diagram to be completed: relationship "take"]
8. Removing multi-valued attributes is what normalization to the first normal form does. For the following
scenario, which attribute will be multi-valued? How can the first normalization be performed by modifying
the database structure?
Patient_ID, Name, Date_birth, Medicine_Name, Quantity
9. Apart from multi-valued attribute problems, we should also solve the problem of M:N relations. First we
will look at the 1:1 relation:
a) Assume there is just one class master for every class, so the ER diagram would be
[ER diagram to be completed: relationship "belong"]
and so, the database structure (database schema) would become
b) Then, we will look at a 1:M relation:
Assume each employee belongs to a department and a department has to have at least one
employee. Then, the ER diagram will be
[ER diagram to be completed: relationship "belong"]
and so, the database structure (database schema) would become
c) Last, we will look at an M:N relation:
Assume a teacher teaches a number of classes and each class has several teachers to teach
different subjects, so the ER diagram would be
[ER diagram to be completed: relationship "teach"]
However, since it is an M:N relation, it will be transformed into
and so, the database structure (database schema) would become
<End of Database Design Exercise 01>
Database Design Exercise 02
1. Given that the relationship Teaches between entities TEACHER and COURSE is one-to-many, table
______ should include a foreign key ______.
A. TEACHER, course_id
B. TEACHER, teacher_id
C. COURSE, teacher_id
D. COURSE, course_id
2. In transforming into database schema, a multi-valued attribute
A. will be mapped into a foreign key
B. will be stored in multiple rows of the same table
C. will be stored in multiple columns of the same table.
D. will require creating a new table
Study the paragraph below carefully and answer the following four questions:
In an air freight service company, each customer will request a sales order for a freight. Each sales order is
taken care of by one salesperson. Each salesperson may take care of many sales orders. A sales order is a
freight requested by a customer. Each customer has made a request for at least one freight.
3. In which of the following tables will salesperson_id not be found?
A. CUSTOMER
B. ORDER
C. SALESPERSON
D. none of the above
4. In which of the following tables will customer_id not be found?
A. CUSTOMER
B. ORDER
C. SALESPERSON
D. none of the above
5. In which of the following tables will salesperson_id be used as a foreign key?
A. CUSTOMER
B. ORDER
C. SALESPERSON
D. none of the above
6. In which of the following tables will customer_id be used as a foreign key?
A. CUSTOMER
B. ORDER
C. SALESPERSON
D. none of the above
7. Given that the relationship studies between STUDENT and SUBJECT is many-to-many,
A. A new table is needed.
B. The tables should be combined into one.
C. The relationship studies should be converted into an attribute
D. A foreign key should be added to a table SUBJECT
Question  1  2  3  4  5  6  7
Answer    C  D  A  C  B  B  A
1. The discipline master of a school wishes to store the late records of students. The following E-R diagram
is drawn.
[E-R diagram: entity STUDENT (attributes Stud_id, Name, Parents_name, Phone, Address) Commits entity LATE (attributes Date, Time, Reason), with cardinalities X and Y]
If a student is late more than 3 times in a semester, a clerk will make a phone call to the student's
parents. If a student is late more than 5 times in a semester, a clerk will send the parents of the
student a letter about the problem, using the address in the above ER diagram.
a) By investigating the above diagram, what problem does the design suffer from?
parents_name may be multi-valued.
b) For the side of the entity STUDENT, is it optional or mandatory?
It is mandatory.
c) What are the values of X and Y in the diagram?
X: 1, Y: M. It is a one-to-many relation.
d) Dissolve the ER diagram and present it as a database schema. Remember to identify the primary
key of each table involved.
STUDENT(stud_id, name, address, phone)
LATE(stud_id, date, time, reason)
STUDENT_PARENT(stud_id, parents_name)
<End of Database Design Exercise 02>
Database Design Exercise 03
M.C.
1. Which of the following is an example of an entity?
A. “Mr. Cheung”
B. Teacher “Mr. Cheung”
C. Teacher
D. The subject taught by “Mr. Cheung”
2. A primary key
(1) can be made up of more than one field
(2) is always a candidate key
(3) can have null value
A. (1) only
B. (1), (2) only
C. (1), (3) only
D. (1), (2) and (3) only
3. Which of the following is not an appropriate attribute for an entity “golf coach”?
A. name
B. sex
C. charge per hour
D. booked_date
4. The field name in a table must
(1) be unique
(2) be made up of English letters
(3) not be the same as the table name
A. (1) only
B. (1), (2) only
C. (1), (3) only
D. (1), (2) and (3) only
5. Which of the following should use memo data type?
A. sex
B. date of birth
C. product description
D. name of student
6. Which of the following is NOT a purpose of creating an index?
A. Carry out sorting with a smaller amount of data
B. Improve data searching performance
C. Improve data ordering performance
D. Improve data updating performance
7. Which of the following is / are important in maintaining referential integrity?
(1) Foreign keys
(2) Set validation rules (constraints)
(3) Avoid the use of derived attributes
A. (1) only
B. (1), (2) only
C. (1), (3) only
D. (1), (2) and (3) only
8. Which of the following is / are derived attribute(s)?
(1) The attribute AverageMark in the table Student
Student (ID, name, EngMark, MathMark, ChineseMark, AverageMark)
(2) The attribute AverageMark in the table Class
Student (StuID, name, sex, ClassID)
Subject (SubjCode, StuID, Mark)
Class (ClassID, SubjCode, AverageMark)
(3) The attribute Post in the table ClubMember
ClubMember (ClubID, StuID, Post)
A. (1), (2) only
B. (1), (3) only
C. (2), (3) only
D. (1), (2) and (3)
9. What would be the consequence(s) of using a derived attribute?
(1) Data inconsistency may result when updating.
(2) Data can be retrieved more efficiently
(3) Data Security is lowered.
A. (1), (2) only
B. (1), (3) only
C. (2), (3) only
D. (1), (2) and (3)
10. What would be used to enhance domain integrity?
(1) Set indexes to an attribute
(2) Set foreign keys
(3) Setting validation rules (constraint)
A. (1) only
B. (2) only
C. (3) only
D. (1), (2) only
11. What would be used to enhance entity integrity?
(1) Set constraint such that the value of an attribute for a composite primary key cannot be NULL
(2) Set constraint such that the value of an attribute for a non-composite primary key cannot be NULL
(3) Set constraint such that only a particular set of data can be inputted to an attribute.
A. (1) only
B. (2) only
C. (3) only
D. (1), (2) only
12. Which of the following about a relational table is NOT true?
A. Table name is unique
B. Primary key is unique
C. Foreign key is unique
D. Field name is unique
13. The referential integrity constraint for a field requires
A. a primary key of a table matches with the foreign key of another table
B. a foreign key of a table matches with the primary key of another table
C. a primary key to be non-empty and unique
D. data come from the same domain
14. In a relational table,
A. a field may have multi-values
B. the sequence of fields is insignificant
C. the sequence of rows is significant
D. rows can be duplicated
15. The advantage of program-data independency is
A. structure of data can be changed without having to change the application program
B. the program has no privilege to access the database
C. structure of data can be known by studying the program codes
D. low level programming language can be used
16. Data redundancy can be minimized by
    A.  entering data only when necessary
    B.  using the database approach
    C.  data validation
    D.  using a traditional file-processing system
17. The domain constraints for a field require the field to have
    A.  non-duplicating values
    B.  non-empty values
    C.  the same data type and range
    D.  the same values
18. In database architecture, a database can be viewed at the view level and the
    (1) Conceptual level
    (2) Physical level
    (3) Logical level
    A.  (1) only
    B.  (2) only
    C.  (3) only
    D.  (1) and (2) only
19. The degree of the relationship “Students borrow books” is
    A.  1
    B.  2
    C.  3
    D.  none of the above
20. Which of the following statements about the relationship “audience watch TV programs” is correct?
    A.  The existence of the entity “audience” in the relationship “watch” is optional.
    B.  The existence of the entity “TV program” in the relationship “watch” is mandatory.
    C.  The maximum cardinality of the entity “TV program” is 1.
    D.  The minimum cardinality of the entity “TV program” is 1.
Answers:
Question   1   2   3   4   5   6   7   8   9  10
Answer     C   B   D   A   C   A   A   A   A   C
Question  11  12  13  14  15  16  17  18  19  20
Answer     D   C   B   B   A   B   C   D   B   A
1.  The following database is used to store students' learning portfolios. A portfolio should contain
    data for the whole of a student's secondary school life, including which clubs the student joined
    in each year.
Student (StuID, name, address, HKID, phone, sex, DateBirth)
Club (ClubID, ClubName, TeacherInChargeID)
JoinClub (RecordNumber, StuID, ClubID, Post)
a)  Point out the primary key and the candidate keys of each table.
    Table       Primary Key     Candidate keys
    Student     StuID           StuID, HKID
    Club        ClubID          ClubID
    JoinClub    RecordNumber    RecordNumber (Not StuID + ClubID)
b)  In what way will this database schema not work properly? Try to write the SQL statements that
    overcome the problem.
    The schema assumes that the information of one particular year only is stored. To store the
    data of several years, a new field called year has to be added to both the Club and JoinClub
    tables.
    ALTER TABLE Club ADD year char(4)
    ALTER TABLE JoinClub ADD year char(4)
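The effect of the two ALTER TABLE statements can be tried out with a minimal sketch such as the one below. It uses Python's built-in sqlite3 module as a stand-in database, which is an assumption and not the system used in these notes. The column sizes and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Original schema (column types are assumptions), followed by the two
# ALTER TABLE statements from the answer above.
conn.executescript("""
CREATE TABLE Club (ClubID char(4), ClubName varchar(30), TeacherInChargeID char(4));
CREATE TABLE JoinClub (RecordNumber int, StuID char(7), ClubID char(4), Post varchar(20));
ALTER TABLE Club ADD year char(4);
ALTER TABLE JoinClub ADD year char(4);
""")

# With the year field, the same student can be recorded in the same club
# in different years (hypothetical data).
conn.execute("INSERT INTO JoinClub VALUES (1, 'S001', 'C01', 'Member', '2004')")
conn.execute("INSERT INTO JoinClub VALUES (2, 'S001', 'C01', 'Chairman', '2005')")

history = conn.execute(
    "SELECT year, Post FROM JoinClub WHERE StuID = 'S001' ORDER BY year"
).fetchall()
print(history)
```

Each row of JoinClub now says in which year the student held which post, which is what the portfolio requires.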
2.  Inspect the following database schemas and briefly describe some scenarios in which they will
    not work correctly.
(i) Record (BookCode, StuID, DoB, Returned) where DoB means Date of Borrow.
    The field amount is missing. The schema will not function properly if the same student
    borrows two books with the same BookCode, which may in fact happen.
(ii) SportTrainer (ID, Name, typeofsport, charge, gender)
    If a trainer is able to train more than one type of sport, this database structure
    will run into problems.
(iii) For a fitness training center, the database structure is as follows:
    course (courseID, courseName, TrainerName, Charge)
    enrollment (courseID, memberID, IsPaid)
    membership (memberID, memberName, memberSex, memberDoB, expiryDate)
    courseDetail (courseID, DateCourse, timeZone) where DateCourse means the dates
    on which the course is held.
    A trainer who does not teach any course in the fitness center will have no
    information in the database.
3.
Now, you are the database administrator of a recreation center, you designed a form as shown below

Tai Tai Recreation Center Facility Order Form

Membership ID:
Date to use the facility:     /     /

Facility        FacilityCode   Charge    Location
Table Tennis    TT01           ($20)     [ ] Room 113       [ ] Room 114       [ ] Room 115
Badminton       BN01           ($45)     [ ] SportsRoom1A   [ ] SportsRoom1B   [ ] SportsRoom1C
BasketBall      BL01           ($150)    [ ] SportsRoom1    [ ] SportsRoom2
Volleyball      VL01           ($100)    [ ] SportsRoom1    [ ] SportsRoom2

Time to use the facility:
Time zone   Duration        Choose
1           12:00 - 1:00    [ ]
2           1:00 - 2:00     [ ]
3           2:00 - 3:00     [ ]
4           3:00 - 4:00     [ ]
5           4:00 - 5:00     [ ]
6           5:00 - 6:00     [ ]
7           6:00 - 7:00     [ ]
8           7:00 - 8:00     [ ]
9           8:00 - 9:00     [ ]
10          9:00 - 10:00    [ ]

Signature:
Date:
It is supposed that a member cannot book more than one facility in the same time zone on a particular day,
i.e. a member cannot book a table tennis court and a basketball court at the same time, nor can he book 2 table
tennis courts at the same time, but he can book a table tennis court for time zones 3 and 4.
You also designed a database schema as shown below:
Facility (FacilityCode, FacilityName, Location, charge)
Membership (MemID, Name, Sex, DateBirth, address, PhoneNumber)
FacilityRecord (MemID, FacilityCode, DoB, timezone) where DoB means Date of Booking.
Is there any problem in the database design? Briefly describe how to solve it.
There may be several different locations for a particular facility, e.g. three rooms for
TT01, so the attribute Location in the table Facility will be multi-valued. To solve
this problem, you should either create a new table to hold data like FacilityCode and
Location, or assign each location its own unique FacilityCode even though the locations
offer the same kind of facility.
Also, the primary key for the table FacilityRecord should be RecordNo + FacilityCode
instead of RecordNo + MemID, because each RecordNo is supposed to be ordered by just
one member. MemID is therefore fully functionally dependent on RecordNo, and hence a
new table should be created.
<End of Database Design Exercise 03>
Database Design Exercise 04
Question 1:
Consider a relational database with three tables, STUDENT, COURSE and GRADE, as shown below:
STUDENT
S_NO    S_NAME
1025    Mary Wu
3350    Tom Leung
4170    Peter Chow

COURSE
C_CODE     C_NAME                  CREDITS
CHEM203    Organic Chemistry II    2
COMF117    Computer Science I      3
MATH001    Mathematics             4
GEOG108    Geography               2

GRADE
StudentID    C_CODE     Score
1025         CHEM203    70
1025         COMF117    75
1025         MATH001    80
3350         COMF117    55
3350         GEOG108    40
4170         GEOG108    75
a)
What is the primary key in table GRADE?
(1 mark)
StudentID + C_Code
<- It is called a composite key. We use a composite key because C_Code is multi-valued and hence has to be
extracted into a separate table. The ER diagram in this case is
[ER diagram: entity Student (attributes S_NO, S_Name) and entity Course (attributes C_Code, C_Name, credit)
are linked by the M:N relationship “take”, which carries the attribute score.]
We should note that score is an attribute of the relationship “take”.
(b)
Describe a scenario to illustrate the data integrity problem when deleting a record in one of the tables.
How can the data integrity problem be avoided? (2 marks)
1.  When a student leaves the school and the corresponding record in STUDENT is
    deleted, the records of that student in GRADE refer to a student that no longer exists.
2.  When a course is cancelled and the corresponding record in COURSE is deleted,
    the records of that course in GRADE refer to a course that no longer exists.
To avoid the data integrity problem, we may
1.  Delete the detail records (GRADE) before deleting the master record (STUDENT, COURSE)
2.  Enforce a referential integrity (or foreign key) constraint on the database.
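Both remedies can be illustrated with a small sketch. It assumes SQLite (through Python's built-in sqlite3 module) as the database, which is not the system used in these notes, and the column types are assumptions. The data is taken from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE STUDENT (S_NO int PRIMARY KEY, S_NAME varchar(30));
CREATE TABLE COURSE (C_CODE char(7) PRIMARY KEY, C_NAME varchar(30), CREDITS int);
CREATE TABLE GRADE (
    StudentID int REFERENCES STUDENT (S_NO),
    C_CODE char(7) REFERENCES COURSE (C_CODE),
    Score int,
    PRIMARY KEY (StudentID, C_CODE)
);
INSERT INTO STUDENT VALUES (1025, 'Mary Wu');
INSERT INTO COURSE VALUES ('CHEM203', 'Organic Chemistry II', 2);
INSERT INTO GRADE VALUES (1025, 'CHEM203', 70);
""")

# Remedy 2: the foreign key constraint refuses to delete a master record
# that is still referenced by a detail record.
try:
    conn.execute("DELETE FROM STUDENT WHERE S_NO = 1025")
    master_deleted_first = True
except sqlite3.IntegrityError:
    master_deleted_first = False

# Remedy 1: delete the detail records first, then the master record.
conn.execute("DELETE FROM GRADE WHERE StudentID = 1025")
conn.execute("DELETE FROM STUDENT WHERE S_NO = 1025")
students_left = conn.execute("SELECT COUNT(*) FROM STUDENT").fetchone()[0]

print(master_deleted_first, students_left)
```

The first DELETE is rejected because GRADE still holds a row for student 1025. Once that row is removed, the STUDENT record can be deleted without leaving orphaned grades.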
Question 2:
A teacher has designed a database, EXAM, to store the final examination results of students as follows:
Field Name    Field Type    Description
StdNo         Numeric       Unique student number
Name          Character     Name of the student
Class         Character     Class of the student
Sex           Character     M = male, F = female
SbjCode       Numeric       Unique subject code
Subject       Character     Full name of the subject
PassMk        Numeric       Pass mark of the subject
Mark          Numeric       Mark of the student in the subject
(a)
Explain briefly how this design leads to data redundancy.
(2 marks)
If a student takes 2 or more subjects, there will be more than 1 record for the same
student, and fields like Name, Class and Sex are stored multiple times.
Similarly, if a subject is taken by more than 1 student, fields like Subject and
PassMk are stored multiple times.
-> Now, we should state that the attributes Name, Class and Sex are fully functionally dependent on the primary key
“StdNo”; however, SbjCode would be multi-valued. So, the table is in unnormalized form.
To fix the problem of data redundancy, the teacher breaks EXAM into three interlinked tables, which use the
above field names only.
(b)
Complete the new design below and underline the primary key field(s) of each table.
(2 marks)
Table      Fields
STUDENT    StdNo, Class, Name, Sex
SUBJECT    SbjCode, Subject, PassMk
EXAM       StdNo, SbjCode, Mark
The primary keys are StdNo in STUDENT, SbjCode in SUBJECT, and StdNo + SbjCode in EXAM.
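The three-table design can be sketched as follows. The sketch assumes SQLite through Python's built-in sqlite3 module, and the column sizes and sample rows are hypothetical. It shows that each student and each subject is stored once, and that the original wide record can still be rebuilt by joining the tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (StdNo int PRIMARY KEY, Class char(2), Name varchar(30), Sex char(1));
CREATE TABLE SUBJECT (SbjCode int PRIMARY KEY, Subject varchar(30), PassMk int);
CREATE TABLE EXAM (StdNo int, SbjCode int, Mark int, PRIMARY KEY (StdNo, SbjCode));

-- One student taking two subjects: the name, class and sex are stored once only.
INSERT INTO STUDENT VALUES (1, '6A', 'Chan Tai Man', 'M');
INSERT INTO SUBJECT VALUES (10, 'Physics', 50);
INSERT INTO SUBJECT VALUES (20, 'Biology', 40);
INSERT INTO EXAM VALUES (1, 10, 65);
INSERT INTO EXAM VALUES (1, 20, 35);
""")

# Rebuild the combined view; the last column is 1 for a pass and 0 for a fail.
report = conn.execute("""
    SELECT STUDENT.Name, SUBJECT.Subject, EXAM.Mark, EXAM.Mark >= SUBJECT.PassMk
    FROM EXAM, STUDENT, SUBJECT
    WHERE EXAM.StdNo = STUDENT.StdNo AND EXAM.SbjCode = SUBJECT.SbjCode
    ORDER BY SUBJECT.SbjCode
""").fetchall()
print(report)
```

Changing the pass mark of a subject now means updating one row in SUBJECT instead of every exam record of that subject.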
Question 3:
What is wrong in the following ER diagram?
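[ER diagram “Inventory”: entity Client (attributes include MemberID and MemberName) and entity Product
(attributes include ProductID, ProductName, Category, Price and Amount) are linked by the M:N relationship
“buy”. The attribute PointEarned also appears in the diagram.]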
Answer:
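The attribute Amount should not be put in the entity “Product”; it should instead be put in the
relationship “buy”. Also, participation on the Client side should be optional instead of mandatory, i.e.
[ER diagram “Inventory”: entity Client (attributes include MemberID and MemberName) and entity Product
(attributes include ProductID, ProductName, Category and Price) are linked by the M:N relationship “buy”,
which now carries the attribute Amount. The attribute PointEarned also appears in the diagram.]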
<End of Database Design Exercise 04>
Past Paper Investigation:
2000 – AS – CA #1
1. (a) A teacher uses a database file to store the information about his students.
The file has the following structure:
The teacher inputs marks and grades to the database file after each test or examination. At the end of the
school term, he finds some problems in the file design. Identify fields that are redundant and explain why
the fields are redundant
(4 marks)
Totaltest – it is simply the sum of all marks of test1 to 3 and all data in this field can be
obtained from the data in fields of test1 to 3. There is no loss of any information if this
field is deleted. Therefore this field is redundant.
Grade- since this is obtained based on average of all marks in the fields of the
database, as long as the criteria for conversion of marks to grades are the same, no
loss of information is envisaged if this field is deleted. Therefore this field is
redundant.
<- At this level, we should know that the fields TotalTest and Grade are redundant. However,
real-world databases sometimes keep redundant fields on purpose in order to speed up the
data retrieval process. In such cases, only very frequently used fields would be stored
redundantly.
<- Of course, data redundancy would undermine data integrity, especially referential
integrity.
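The point that a derived field loses no information when it is dropped can be sketched as below. The table name Marks, its columns and the sample marks are hypothetical, since the original file structure is not reproduced here, and SQLite (Python's built-in sqlite3 module) is assumed as the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- No TotalTest column: the total is derived whenever it is needed.
CREATE TABLE Marks (Name varchar(30), Test1 int, Test2 int, Test3 int);
INSERT INTO Marks VALUES ('Chan Tai Man', 60, 70, 80);
""")

# A mark is corrected after entry. There is no stored total to forget about,
# so the total cannot become inconsistent with the individual marks.
conn.execute("UPDATE Marks SET Test3 = 90 WHERE Name = 'Chan Tai Man'")

totals = conn.execute("SELECT Name, Test1 + Test2 + Test3 FROM Marks").fetchall()
print(totals)
```

If TotalTest were stored as a field, the UPDATE above would also have to recalculate it, and forgetting to do so would leave inconsistent data.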
(b) The teacher would like to add a field that will store the talent of the students (e.g. special skills,
strengths, personal interests, etc.) to the database file. The teacher cannot decide whether the field
should be declared a character type or a memo type. Compare the two data types and recommend
the most suitable data type for the teacher to use.
(4 marks)
Character type of data usually stores information of a certain length that does not
differ greatly. For example, names are stored as character type of length 25. Although
there are names that are short and there are names that are long, they would not be
much longer than 25 characters in length.
However, memo type of data stores information that may vary a lot in their lengths.
Memo fields can even include graphics or sounds. For example, talents of students
may be very different among different students. Some students will have fewer talents
and thus will have just one or two words stored in the field while for some other
students with many talents, they will have as much as some paragraphs stored in the
field.
For the above reasons, the memo type of data is recommended for the teacher in
storing the students' talents.
<- Of course, we can use the memo type as the data type for the field ‘talent’. It may look like
Name             Talent
Chan Tai Man     Tennis, Piano, C++
Chan Siu Man     Flash, Piano, Violin
Wong Siu Ling    Writing
By using the following SQL,
SELECT name FROM student WHERE UPPER(talent) LIKE '%PIANO%'
we are able to find the names of the students who are good at piano.
However, since talent is multi-valued, it is recommended to put talent into a new table such
that the field talent contains just one skill. This is especially important when the skills are
pre-defined, i.e. we can make the field ‘talent’ a foreign key which is mapped to another
database table. On that foreign key, we can set the appropriate constraint, e.g. the
value of the field talent must exist in the parent table.
To illustrate further, let's talk about these two items, students and talents. Originally in the
question, student is regarded as the entity and talent is regarded as an attribute. Now, let's
think of them as two separate entities with the relationship ‘OWN’, i.e. STUDENTS OWN TALENTS.
Both of them are optional.
[ER diagram: entity STUDENT and entity TALENT are linked by the many-to-many (M:N) relationship OWN.]
Note: We should always be careful about cases like what would happen when some
students have no particular skills, i.e. null in the field ‘talent’ for some students, or some
students just do not appear in the table ‘STUDENT’.
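The two-entity design can be sketched as follows, assuming SQLite through Python's built-in sqlite3 module. The table names Talent and Own, the StdID values and the fourth student are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce foreign keys
conn.executescript("""
CREATE TABLE Student (StdID int PRIMARY KEY, Name varchar(30));
CREATE TABLE Talent (TalentName varchar(20) PRIMARY KEY);   -- pre-defined skills
CREATE TABLE Own (
    StdID int REFERENCES Student (StdID),
    TalentName varchar(20) REFERENCES Talent (TalentName),
    PRIMARY KEY (StdID, TalentName)                         -- one skill per row
);
INSERT INTO Student VALUES (1, 'Chan Tai Man');
INSERT INTO Student VALUES (2, 'Chan Siu Man');
INSERT INTO Student VALUES (3, 'Wong Siu Ling');
INSERT INTO Student VALUES (4, 'Lee Ka Ho');                -- has no talent recorded
INSERT INTO Talent VALUES ('Piano');
INSERT INTO Talent VALUES ('Tennis');
INSERT INTO Talent VALUES ('Writing');
INSERT INTO Own VALUES (1, 'Tennis');
INSERT INTO Own VALUES (1, 'Piano');
INSERT INTO Own VALUES (2, 'Piano');
INSERT INTO Own VALUES (3, 'Writing');
""")

# An exact match replaces the pattern search on a multi-valued memo field.
pianists = conn.execute("""
    SELECT Student.Name FROM Student, Own
    WHERE Student.StdID = Own.StdID AND Own.TalentName = 'Piano'
    ORDER BY Student.StdID
""").fetchall()

# Students without any talent simply have no row in Own.
no_talent = conn.execute(
    "SELECT Name FROM Student WHERE StdID NOT IN (SELECT StdID FROM Own)"
).fetchall()

# A talent that is not in the parent table is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Own VALUES (2, 'Flying')")
    accepted_unknown_talent = True
except sqlite3.IntegrityError:
    accepted_unknown_talent = False

print(pianists, no_talent, accepted_unknown_talent)
```

A student with no skills needs no null value at all in this design, which addresses the first case mentioned in the note above.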
Contents
Introduction to Structured Query Language
What is SQL?
History for SQL (not within the curricula)
Data Definition Language and Data Manipulation Language
An Illustrative Example – A Library System
Commonly Used Data Types in SQL
SQL Statements
Creating Database Objects
Create a database
Create a table in a database
Creating Table with Integrity Rule
Create table with primary key
Create table with foreign key
Modifying Table Structure
Add column
Drop column
Change columns’ data type
Change column(s) to NOT NULL
Add a primary key to an existing table
Deleting Database Objects
Delete a table
Delete a database
Adding Data to Tables
Insert new row
Insert new record with only specified column field(s)
Retrieving Data from Database Table(s)
Retrieve all fields from a table
Retrieve value(s) from particular column(s) of a table
Retrieve value(s) from particular column(s) of a table without duplication
Retrieve data with specified selection criteria
Creating and Deleting Data View
Create a data view
Delete a data view
Update the value in a column
Update values in a number of columns
Delete record(s) from the table
Result Presentation
The ORDER BY clause
The GROUP BY … HAVING clause
Operators Used with WHERE
The LIKE operator
The IN operator
The BETWEEN operator
The AND operator
The OR operator
Add alias to a column
Joining Tables
Equijoin
The NATURAL JOIN operator
The INNER JOIN operator
The LEFT (OUTER) JOIN operator
The RIGHT (OUTER) JOIN operator
The FULL (OUTER) JOIN operator
Combining Query Results
The UNION operator
The INTERSECT operator
The EXCEPT/MINUS operator
Using nested SELECT statement
Arithmetic Operators/Functions
String Functions
Aggregate Functions
The AVG function
The COUNT function
The MAX function
The MIN function
The SUM function
Create/Drop Table Index
Exporting Data from MS Access
Export Data from an MS Access Database to Another Access Database
Export Data from an MS Access Database in other file formats
Introduction to Structured Query Language
What is SQL?
Structured Query Language (SQL) is a standard language for manipulating and querying database
objects (e.g., table structures and contents) in a relational database management system. For
simplicity, we refer to a relational database management system as a database from now on. SQL
allows you to access a database. SQL can be used to define database table structures and to store,
select and manage data in the database, including data insertion, update and deletion. SQL is
widely used in databases like MySQL, DB2, Oracle, PostgreSQL, Sybase, Microsoft SQL Server,
MS Access, etc.
History for SQL (not within the curricula)
In the early 1970s, a seminal paper on the relational database model authored by E.F. Codd
received considerable attention from the database community. The relational database model
provided a sound theoretical framework for the development of a well-formed query language
that the model could support. By 1974, IBM had defined a language called the ‘Structured English
Query Language’ or SEQUEL. The name was later shortened to Structured Query Language (SQL).
In 1986, a standard for Structured Query Language (SQL) was defined by the American National
Standards Institute (ANSI), and this became an international standard recognized by the
International Organization for Standardization (ISO) in 1987. In 1989, a revised standard, commonly
known as SQL89 or SQL1, was published. The ANSI committee released the SQL92 standard in 1992
(also called SQL2). This standard addressed several weaknesses in SQL89 and set forth conceptual
SQL features which at that time exceeded the capabilities of any existing RDBMS implementation.
The SQL92 standard was approximately six times the length of its predecessor. Because of this
disparity, the authors defined three levels of SQL92 compliance: Entry-level conformance,
Intermediate-level conformance, and Full conformance.
In 1999, ANSI/ISO released the SQL99 standard (also called SQL3). This standard addresses
some of the more advanced areas of modern SQL systems, such as object-relational database
concepts, call level interfaces, and integrity management. SQL99 replaces the SQL92 levels of
compliance with its own degrees of conformance: Core SQL99 and Enhanced SQL99.
Although various databases may implement their SQL slightly differently, they support the same
major functions (such as SELECT, UPDATE, DELETE, INSERT, WHERE, etc.) in a similar way
in order to fulfill the ANSI standard. The SQL statements introduced in this note are largely
based on the Entry-level conformance of SQL92.
Teaching remarks
• The SQL statements that the A/AS level curricula cover are so basic that even the
  entry level of SQL92 supports them.
• Most of the SQL statements included in this note have been tested on Microsoft Access 2003.
  It supports SQL92 but this requires some reconfiguration. The default database format is
  Access 2000, which is not compatible with SQL92. To change the default database format,
  start Access 2003. Click Tools, then Options. Click the Advanced tab and change the
  Default File Format to “MS Access 2002-2003” (see Figure 1). To change the SQL syntax to
  SQL92, click Tools and then Options. Click the Tables/Queries tab and check both boxes
  (This database and Default for new databases) under SQL Server Compatible Syntax
  (ANSI 92) (see Figure 2).
• It appears that the SQL92 supported by Access 2003 conforms to the entry level only. For
  example, it does not support some join features such as NATURAL JOIN and FULL
  OUTER JOIN. Other unsupported features include EXCEPT and INTERSECT.
• A subset of the SQL92 standard that is both usable and commonly supported can be found at
  http://www.firstsql.com/tutor.htm.
Figure 1. Setting Access 2003’s default database format to “Access 2002 – 2003” to support SQL92.
Figure 2. Setting Access 2003’s default SQL syntax to conform to SQL92.
Data Definition Language and Data Manipulation Language
SQL supports functions such as building and manipulating database objects, populating database
tables with data, updating existing data in tables, deleting data, performing database queries,
controlling database access and overall database administration. Such functions can be classified
into a number of categories, and the two best-known categories are Data Definition Language
(DDL) and Data Manipulation Language (DML).
DDL allows users to create and restructure database objects, such as creating and deleting database
tables. Besides, DDL can be used to define table indexes as well as foreign keys between tables.
Some of the commonly used DDL commands are:
• CREATE TABLE
• ALTER TABLE
• DROP TABLE
• CREATE INDEX
• ALTER INDEX
• DROP INDEX
• CREATE VIEW
• DROP VIEW
DML allows users to manipulate data within the objects of a database. Some of the commonly used
DML commands are:
• SELECT
• INSERT INTO
• UPDATE
• DELETE
In a nutshell, DDL allows database users to define database objects whereas DML allows database
users to retrieve, insert, delete and update data in a database.
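The split between the two categories can be sketched as follows. The sketch assumes SQLite through Python's built-in sqlite3 module rather than the MS Access setup used in this note, and it uses a cut-down, hypothetical version of the Student table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a database object.
conn.execute("CREATE TABLE Student (StdID char(7), Name varchar(30), Class char(2))")

# DML: insert, update, delete and retrieve the data held in that object.
conn.execute("INSERT INTO Student VALUES ('0002011', 'Chan Ming Wai', '2C')")
conn.execute("INSERT INTO Student VALUES ('0002012', 'Wong Wai Ming', '2B')")
conn.execute("UPDATE Student SET Class = '3C' WHERE StdID = '0002011'")
conn.execute("DELETE FROM Student WHERE StdID = '0002012'")
rows = conn.execute("SELECT StdID, Name, Class FROM Student").fetchall()

# DDL again: remove the object itself, not just its rows.
conn.execute("DROP TABLE Student")
tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()

print(rows, tables)
```

Note the difference between DELETE, which removes rows and leaves the table in place, and DROP TABLE, which removes the table definition together with any remaining rows.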
An Illustrative Example – A Library System
In order to help readers understand the SQL statements that we are going to introduce, those
statements will be illustrated with a hypothetical library database as far as possible. The tables used
in the simple library database are the Student, Book and LoanRecord tables, and their details are
given below. Readers are reminded that the tables and fields kept in the proposed database are far
fewer than what a real library system requires. We keep the example database simple and yet
adequate for illustration purposes.
The Student table is used to store basic student information like student ID, name, the class that the
student belongs to, and phone number. The data fields of the Student table are as follows:
StdID      Name             Class    OverduePay    PhoneNo
0002011    Chan Ming Wai    2C       12.5          21238782
0002012    Wong Wai Ming    2B       30.5          21234456
0002013    Cheung Ka Fai    2C       0             23212321
0002014    Chang Wai Yee    4A       20.5          23213123
0002015    Lee Oi Lam       5C       3             25214123
0002016    Sze Yuk Ki       7B       1.5           26434534
Table 1. Data in the Student table.
Table 2 describes the characteristics of the data fields in the Student table.
Field Name    Description
StdID         • Unique Student number
              • Text string – 7 digits
              • Not null (i.e., the field is mandatory and a value is to be inserted)
Name          • Student Name
              • Text string – 30 characters
              • Not null
Class         • The class student study
              • Text string – 2 characters
              • Not null
PhoneNo       • Phone Number
              • Text string – 8 digits
OverduePay    • Overdue Payment
              • A number with two decimal places (<= 999.99)
Table 2. Characteristics of the data fields in the Student table.
Teaching remark
• Some people may opt to define numeric data like StdID and PhoneNo as integers instead of text
  strings. The reason why we prefer to represent the fields as text strings is that the “numbers”
  are not used for computation.
• Two different data types can be used to define text strings (see next section), and it is important
  for teachers to clarify to their students the key difference between the data types.
The Book table contains the key information about the books in the library. Details of the Book
table are shown in Table 3.
BookID      Title         Type
00000001    Apple Tree
00000002    Bible
00000003    Star Wing
Table 3. Data in the Book table.
Table 4 describes the characteristics of the data fields in the Book table.
Field Name    Description
BookID        • Unique book ID
              • Text string – 8 digits
              • Not null
Title         • Book Title
              • Text string – 100 characters
              • Not null
Type          • Book category
              • Text string – 3 digits
Table 4. Characteristics of the data fields in the Book table.
The LoanRecord table contains information on the library items on loan (or once on loan). Details of
the LoanRecord table are as follows:
LoanRecID    StdID      BookID      DateOfBorrow    Status
1            0002012    00000001    20051001        1
2            0002011    00000002    20020112        2
3            0002012    00000003    20031211        2
4            0002013    00000002    20031001        2
5            0002011    00000002    20051018        1
Table 5. Data in the LoanRecord table.
Table 6 describes the characteristics of the data fields in the LoanRecord table.
Field Name      Description
LoanRecID       • Unique loan record ID
                • Text string – 8 digits
                • Not null
StdID           • Student number
                • Text string – 7 digits
                • Not null
BookID          • Book ID
                • Text string – 8 digits
                • Not null
DateOfBorrow    • Date of the book being borrowed
                • Date data type
                • Not null
Status          • Loan status (1 – on loan; 2 – returned; 3 – on hold)
                • Text string – 1 digit
                • Not null
Table 6. Characteristics of the data fields in the LoanRecord table.
Commonly Used Data Types in SQL
The data type of a data item restricts the values that the data item can take and the operations which
one can perform on that data item. Table 7 gives some of the commonly used data types in SQL.
Data Type                  Description
INTEGER or INT,            Hold integers only. The three types differ in the
SMALLINT,                  minimum and maximum values that they can represent.
TINYINT
DECIMAL(size, decimal),    Hold numbers with fractions. The maximum number of
NUMERIC(size, decimal)     digits is specified by size. The maximum number of
                           decimal places is specified by decimal.
CHAR(size)                 Hold a fixed-length text string. The length of the
                           string is specified by size. Unused space is padded
                           with space characters.
VARCHAR(size)              Hold a variable-length string. The maximum length of
                           the string is specified by size. Unused space is
                           not padded with any characters.
DATE                       Date formats may differ between databases, but
                           they all contain a calendar date with year, month and day.
Table 7. Some basic data types used in SQL.
Note that the Boolean data type, which accepts TRUE or FALSE as its value, is not defined in
SQL92, but in SQL99. However, some databases support the data type even though they do not
conform to SQL99.
Teaching remarks
• A character string stored in a CHAR column is left-justified and padded with trailing blanks to
  the length of the column. All the strings stored in a CHAR column have the same length. These
  trailing blanks are preserved in query results.
• A character string stored in a VARCHAR column has exactly the same length as the source
  string or the expression that generated the string (including trailing blanks). Character strings
  stored in a VARCHAR column can vary in length.
• A character string stored in a VARCHAR column incurs a 2-byte overhead. Do not use this data
  type for columns less than 6 bytes long or for columns that store strings of the same length. Use
  the CHAR data type instead.
SQL Statements
Creating Database Objects
Create a database
The CREATE DATABASE statement can be used to create a database with a specified name.
Syntax
CREATE DATABASE database_name
Example
A database named “library_system” is created with the following statement.
CREATE DATABASE library_system
Teaching remark

Some databases like Microsoft Access may require users to create a database by using their own
user interface instead of within a SQL environment.
Create a table in a database
The CREATE TABLE statement can be used to create a table with a specified name.
Syntax
CREATE TABLE TableName
(
Column1 DataType1,
Column2 DataType2,
.......
)
Example 1
Create a table called “Teacher” with two columns named “Name” and “Age” respectively.
CREATE TABLE Teacher
(
Name varchar(30),
Age int
)
Sample Query - Q1_1_CreateTableTeacher
Result
An empty Teacher table with two fields – Name and Age – is created.
Teaching remark

The Teacher table is not required in the library system example. It is created to demonstrate
another SQL statement which removes database tables.
Example 2
Create a table called “Book” that contains fields named “BookID”, “Title” and “Type” such that a
value for “BookID” must be entered for each row and its value is unique within the table.
CREATE TABLE Book
(
BookID char(8) NOT NULL UNIQUE,
Title varchar(100),
Type int
)
Sample Query Q1_2_CreateTableBook
Result
An empty Book table with three fields – BookID, Title and Type – is created.
The BookID field
is mandatory (indicated by “NOT NULL”) and unique (indicated by “UNIQUE”) within the Book
table.
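The effect of NOT NULL and UNIQUE can be checked by trying to insert rows that break them. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than the MS Access 2003 environment used in these notes; SQLite accepts the same column definitions but does not enforce the declared lengths, and the exact error messages are SQLite's own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database (assumption)
conn.execute("""
    CREATE TABLE Book
    (
    BookID char(8) NOT NULL UNIQUE,
    Title varchar(100),
    Type int
    )
""")
conn.execute("INSERT INTO Book VALUES ('00000001', 'Apple Tree', NULL)")

# Each statement below violates one of the two constraints on BookID.
attempts = [
    ("duplicate BookID", "INSERT INTO Book VALUES ('00000001', 'Bible', NULL)"),
    ("missing BookID", "INSERT INTO Book (Title) VALUES ('Star Wing')"),
]
rejected = []
for label, sql in attempts:
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError as err:
        rejected.append(label)
        print(label, "->", err)

book_count = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
print("rows stored:", book_count)
```

Only the first row is stored; both later inserts are refused by the database rather than by the application.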
Creating Table with Integrity Rule
Create table with primary key
For each table, it is necessary to have a field or a combination of selected fields such that their
values can be used to identify each table row uniquely. Such an identifier is known as a candidate
key. The concept of candidate key is essential to good database design. The most commonly used
candidate key of a table is typically selected to be the primary key of the table.
The PRIMARY KEY keyword is used to specify the fields in a table that compose the table’s
primary key.
Syntax
CREATE TABLE TableName
(
Column1 DataType NOT NULL,
Column2 DataType NOT NULL,
.......
PRIMARY KEY (Column1, Column2, …)
)
Teaching remark

Technically, all fields in a primary key should be defined to be UNIQUE and NOT NULL.
Although some databases like Microsoft Access 2003 may take all primary key fields as
UNIQUE and NOT NULL even though they are not specified, it is a good practice to specify
them explicitly.
Example
To create a table called “Student” with the primary key “StdID”, we can use the following
statement:
CREATE TABLE Student
(
StdID char(7) NOT NULL UNIQUE,
Name varchar(30),
Class char(10),
Age smallint,
OverduePay decimal(5,2),
PRIMARY KEY (StdID)
)
Sample Query Q2_1_createStudent_PriKey
Teaching remark

The length of the Class field is set to 10 characters long intentionally. We will alter it to 2
characters long using another SQL statement later.
Create table with foreign key
A foreign key (which may be composite) to another table ensures that the value of the foreign key
field(s) can be found in the primary key of the referenced table. The following example shows how
to create a table with foreign keys in a database.
Syntax
CREATE TABLE TableName1
(
Column1 DataType1,
Column2 DataType2,
.......
FOREIGN KEY (ColumnX, ColumnY) REFERENCES TableName2
)
Example
In this example, we would like to create a table “LoanRecord” with a primary key “LoanRecID”
and two foreign keys “StdID” and “BookID” that reference the tables “Student” and “Book”
respectively by using the following statement.
CREATE TABLE LoanRecord
(
LoanRecID char(8) NOT NULL,
StdID char(7) NOT NULL,
BookID char(8) NOT NULL,
Dateofborrow date,
Status char(1),
PRIMARY KEY (LoanRecID),
FOREIGN KEY (StdID) REFERENCES Student,
FOREIGN KEY (BookID) REFERENCES Book
)
Sample Query Q2_2_CreateLoanRecord
The SQL script given above for creating LoanRecord table cannot run successfully because a
primary key has not been defined for the Book table created earlier. It is important to rectify the
problem by altering the structure of the Book table before running the above SQL script again.
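Once both referenced tables have primary keys, the foreign keys reject loan records that point at a non-existent student or book. The following is a minimal sketch, assuming Python's built-in sqlite3 module instead of MS Access 2003; the tables are cut down to the key columns, and the loan record numbers and the student ID '9999999' are hypothetical. SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch (assumption of this sketch)
conn.executescript("""
    CREATE TABLE Student (StdID char(7) NOT NULL, Name varchar(30), PRIMARY KEY (StdID));
    CREATE TABLE Book (BookID char(8) NOT NULL, Title varchar(100), PRIMARY KEY (BookID));
    CREATE TABLE LoanRecord
    (
    LoanRecID char(8) NOT NULL,
    StdID char(7) NOT NULL,
    BookID char(8) NOT NULL,
    PRIMARY KEY (LoanRecID),
    FOREIGN KEY (StdID) REFERENCES Student,
    FOREIGN KEY (BookID) REFERENCES Book
    );
    INSERT INTO Student VALUES ('0002011', 'Chan Edward');
    INSERT INTO Book VALUES ('00000001', 'Apple Tree');
""")

# Both the student and the book exist, so this loan record is accepted.
conn.execute("INSERT INTO LoanRecord VALUES ('1', '0002011', '00000001')")

# Student '9999999' does not exist, so the foreign key rejects the row.
try:
    conn.execute("INSERT INTO LoanRecord VALUES ('2', '9999999', '00000001')")
    orphan_accepted = True
except sqlite3.IntegrityError as err:
    orphan_accepted = False
    print("rejected:", err)

loan_count = conn.execute("SELECT COUNT(*) FROM LoanRecord").fetchone()[0]
print("loan records stored:", loan_count)
```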
Important remark

A special view on one or more tables in the database, in the form of a kind of “virtual” table, can
be created with the use of the CREATE VIEW statement. The data in the virtual table is
extracted by a SELECT statement. Both the CREATE VIEW and SELECT statements will
be covered later.
Modifying Table Structure
If required, a table structure can be altered with the use of various forms of the ALTER TABLE
statement.
Add column
To add column(s) in a table, use ADD within the ALTER TABLE statement.
Syntax
ALTER TABLE TableName ADD ColumnName DataType
Example
To add a column named “PhoneNo” in the “Student” table, we can use the following statement.
ALTER TABLE Student ADD PhoneNo char(8)
Sample Query Q3_1_AlterStudenttable
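What happens to rows that already exist when a column is added can be seen in a small experiment. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the Student table is cut down to two columns for brevity, and PRAGMA table_info is a SQLite-specific way of listing the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (StdID char(7) NOT NULL, Name varchar(30), PRIMARY KEY (StdID))"
)
conn.execute("INSERT INTO Student VALUES ('0002011', 'Chan Edward')")

# Same statement as in the example above.
conn.execute("ALTER TABLE Student ADD PhoneNo char(8)")

# Column names are in the second field of each table_info row.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]
existing_row = conn.execute("SELECT * FROM Student").fetchone()
print(columns)
print(existing_row)
```

The new column is appended after the existing ones, and the row that was already in the table holds NULL (shown as None in Python) in it.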
Result
Drop column
To drop column(s) in a table, use DROP within the ALTER TABLE statement.
Syntax
ALTER TABLE TableName DROP ColumnName
Example
To drop a column ‘Age’ in the “Student” table, we can use the following statement.
ALTER TABLE Student DROP Age
Sample Query Q3_2_Alterstudent_Drop
Result
Change columns’ data type
Apart from adding or dropping column(s) in a table, we can also change the data type and other
characteristics of existing column(s) by using the ALTER TABLE … ALTER COLUMN
statement.
Syntax
ALTER TABLE TableName ALTER COLUMN Column1 NewDataType
Teaching remark

If a new data type is set for an existing column, the values that already exist in the column must
be compatible with the new data type. Otherwise, the query will not run successfully.
Example
To change the data type of the ‘Class’ column to char(2) in the “Student” table, we can use the
following statement:
ALTER TABLE Student ALTER COLUMN Class char(2)
Sample Query Q3_3_Changedatatype
Result
Change column(s) to NOT NULL
Syntax
ALTER TABLE TableName ALTER COLUMN Column1 DataType NOT NULL
Example
To make the ‘Name’ column NOT NULL in the “Student” table, we can use the following
statement:
ALTER TABLE Student ALTER COLUMN Name varchar(30) NOT NULL
Sample Query Q3_4_changefieldNotNull
Result
Teaching remark

In the above MS Access 2003 interface, the item “Required” means a mandatory entry. In
other words, the value for the field cannot be NULL, i.e., NOT NULL.
Add a primary key to an existing table
Apart from defining the primary key when creating a table, we can also add a primary key to an
existing table by altering the table.
Syntax
ALTER TABLE TableName ADD PRIMARY KEY (ColumnName)
Example
ALTER TABLE Book ADD PRIMARY KEY (BookID)
Sample Query Q3_5_AddPriKey
Result
Teaching remark

As the Book table has a primary key now, the SQL script for creating the LoanRecord table that
references the Book table can now run successfully.
Deleting Database Objects
If required, a database table or even the whole database can be deleted.
Delete a table
To delete a table, use the DROP TABLE statement.
Syntax
DROP TABLE TableName
Example
DROP TABLE teacher
Sample Query Q3_6_Droptable
Delete a database
We can delete the entire database with the use of the DROP DATABASE statement.
Syntax
DROP DATABASE DatabaseName
Example
DROP DATABASE my_database
Teaching remarks


The DROP DATABASE statement should be used very rarely.
You will not be able to run the DROP DATABASE statement within the graphical user
environment of MS ACCESS 2003.
Adding Data to Tables
To insert data into a table, we can use the INSERT INTO statement. We can insert a complete new
row, or a new row in which only specified fields are given values.
Insert new row
Syntax
INSERT INTO TableName VALUES
(
Value1,
Value2,
.......
)
Value1 is the value of the first field of the TableName table, in the order in which the fields were
defined when the table was created. Similarly, Value2 is the value of the second field of the table.
Example
The following query inserts data into the Student table.
INSERT INTO Student VALUES ('0002011', 'Chan Edward', '1C', 12.5,
'21238782');
INSERT INTO Student VALUES ('0002012', 'Wong Wai Ming', '2B', 30.5,
'21234456');
INSERT INTO Student VALUES ('0002013', 'Cheung Ka Fai', '1C', 0,
'23212321');
INSERT INTO Student VALUES ('0002014', 'Chang Wai Yee', '4A', 20.5,
'23123123');
INSERT INTO Student VALUES ('0002015', 'Lee Oi Lam', '5C', 3, '25214123');
INSERT INTO Student VALUES ('0002016', 'Sze Yuk Ki', '7B', 1.5, '26434534');
Sample Query Q4_1_InsertData – Q4_6_InsertData
Result
Insert new record with only specified column field(s)
The following statement shows how to insert a new record with specified column field(s).
Syntax
INSERT INTO TableName (Column1, Column2..) VALUES
(
Value1,
Value2,
.......
)
Example
We insert the book IDs and titles of three books into the Book table. The book category
information (stored in the Type field) is empty for the three books.
INSERT INTO Book (BookID, Title) VALUES ('00000001', 'Apple Tree');
INSERT INTO Book (BookID, Title) VALUES ('00000002', 'Bible');
INSERT INTO Book (BookID, Title) VALUES ('00000003', 'Star Wing');
Sample Query Q4_7_InsertSpecialField - Q4_9_InsertSpecialField
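The difference between the two forms of INSERT INTO can be checked directly. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the fourth book ('00000004', 'Hypothetical Title') and the deliberately incomplete insert are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Book (BookID char(8) NOT NULL UNIQUE, Title varchar(100), Type int)"
)

# Column list given: the unlisted Type field is left empty (NULL).
conn.execute("INSERT INTO Book (BookID, Title) VALUES ('00000001', 'Apple Tree')")

# No column list: one value per field, in table-definition order.
conn.execute("INSERT INTO Book VALUES ('00000004', 'Hypothetical Title', 2)")

# No column list and too few values: SQLite refuses the statement.
try:
    conn.execute("INSERT INTO Book VALUES ('00000005', 'Too Few Values')")
    short_insert = "accepted"
except sqlite3.OperationalError as err:
    short_insert = "rejected"
    print("rejected:", err)

rows = conn.execute("SELECT BookID, Title, Type FROM Book ORDER BY BookID").fetchall()
for row in rows:
    print(row)
```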
Result
Retrieving Data from Database Table(s)
To select specific data from one or more tables, we can use the SELECT statement. The SELECT
statement can be used in conjunction with other SQL statements to build sophisticated database
queries.
Retrieve all fields from a table
In an SQL statement, the symbol “*” stands for “all fields”. The SELECT statement can also be
used to retrieve data from multiple tables, but we defer that discussion to a later stage. We can use
the following statement to select all fields from a database table.
Syntax
SELECT * FROM TableName
Example
The following SQL statement retrieves (and displays) all records in the Student table.
SELECT * FROM Student
Sample Query Q5_1_select
Result
Retrieve value(s) from particular column(s) of a table
To select data from particular column of a table, we can use the SELECT statement too.
Syntax
SELECT Column1, Column2…
FROM TableName
Example
To select the ‘Name’ and ‘Class’ columns from the Student table, we can use the statement as
below.
SELECT Name, Class FROM Student
Sample Query Q5_2_SelectSpecificField
Result
Teaching remark

If the values of the selected columns from different rows of the table are the same, multiple
occurrences of the same values will result. To avoid the duplication, the SELECT DISTINCT
statement is required.
Retrieve value(s) from particular column(s) of a table without duplication
The SELECT DISTINCT statement is used to select the value(s) of those specified column field(s)
with no duplication. The syntax of this statement is as follows.
Syntax
SELECT DISTINCT Column1, Column2…
FROM TableName
Example
In this example, we would like to identify all students who have used the library service at least
once. If we use the SELECT statement without the DISTINCT keyword, multiple occurrences of
the same students may appear if those students use the library services more than once. To avoid
the duplication, we retrieve all distinct value(s) of the ‘StdID’ field from the LoanRecord table (see
Table 5 for its content) with the use of the SELECT DISTINCT statement.
SELECT DISTINCT StdID FROM LoanRecord
Sample Query Q5_3_SelectDistinct
Result
Retrieve data with specified selection criteria
A WHERE clause can be appended to the basic SELECT statement to specify the condition(s) that
the retrieved data need to fulfill. Rows that do not meet the specified condition(s) will not be
retrieved. When more than one condition is specified, AND/OR may be used to join the
conditions.
Syntax
SELECT Column1, Column2… FROM TableName
WHERE Condition(s)
Common operators used in the WHERE clause are tabulated below.
Operator    Description
=           Equal to
<>          Not equal to
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to
BETWEEN     Within the range
LIKE        Match the pattern
Example
In this example, we would like to retrieve records of those students in the class “1C”.
SELECT * FROM Student WHERE class = '1C'
Sample Query Q6_1_SelectwithCriteria
Teaching remark

Except for numeric values, the operand(s) of the operator must be enclosed by a pair of single
quotation marks ‘’.
Result
Creating and Deleting Data View
Create a data view
With the use of the CREATE VIEW statement, users may create a special view on one or more
tables (or views) in the database in form of a new “virtual” table. The data view is created with
the use of an associated SELECT statement. Most SQL statements that apply to a database table
can also be applied to a data view.
Syntax
CREATE VIEW ViewName (Column1, Column2…)
AS Select-Statement;
Example
In this example, we would like to create a data view to store the Book ID of those library books that
are currently on loan and their corresponding borrowers (Student ID and Name).
CREATE VIEW BookOnLoan_n_Borrower_View (StdID, Name, BookID)
AS SELECT Student.StdID, Name, BookID
FROM Student, LoanRecord
WHERE Student.StdID = LoanRecord.StdID AND status='1';
Sample Query Q29_Create_View
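Because a view is a “virtual” table, it stores no data of its own and always reflects the current content of the underlying tables. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the tables are cut down to the columns the view needs, and the loan rows (including the reading of Status '1' as “on loan”) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StdID char(7) PRIMARY KEY, Name varchar(30));
    CREATE TABLE LoanRecord (LoanRecID char(8) PRIMARY KEY, StdID char(7),
                             BookID char(8), Status char(1));
    INSERT INTO Student VALUES ('0002011', 'Chan Ming Wai');
    INSERT INTO Student VALUES ('0002012', 'Wong Wai Ming');
    -- Hypothetical loan rows; Status '1' is taken to mean "currently on loan".
    INSERT INTO LoanRecord VALUES ('1', '0002012', '00000001', '0');
    INSERT INTO LoanRecord VALUES ('2', '0002011', '00000002', '1');

    CREATE VIEW BookOnLoan_n_Borrower_View (StdID, Name, BookID)
    AS SELECT Student.StdID, Name, BookID
    FROM Student, LoanRecord
    WHERE Student.StdID = LoanRecord.StdID AND Status = '1';
""")

view_query = "SELECT * FROM BookOnLoan_n_Borrower_View ORDER BY StdID"
before = conn.execute(view_query).fetchall()

# Change the underlying table; the view itself is not touched.
conn.execute("UPDATE LoanRecord SET Status = '1' WHERE LoanRecID = '1'")
after = conn.execute(view_query).fetchall()

print(before)
print(after)
```

The second query returns one more row even though nothing was inserted into the view.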
Result
A data view known as BookOnLoan_n_Borrower_View is created.
The data view has the following content.
Delete a data view
A data view can be deleted with the use of the DROP VIEW statement.
Syntax
DROP VIEW ViewName;
Example
The BookOnLoan_n_Borrower_View data view created earlier can be removed with the following
SQL statement.
DROP VIEW BookOnLoan_n_Borrower_View;
Sample Query Q29_Drop_View
Result
The BookOnLoan_n_Borrower_View data view is removed.
Updating Data in a Table
Apart from retrieving data from a table, we can also modify selected data in a table by using the
UPDATE statement and delete selected row(s) from a table.
Update the value in a column
To modify values in a selected column of one or more rows, we can use the UPDATE … SET
statement.
Syntax
UPDATE TableName
SET Column = NewValue
WHERE Condition(s)
The WHERE clause is optional.
If the WHERE clause is not used, the value in the specified column of each row will be changed to
the new value.
Example
Suppose we had wrongly put ‘1C’ as the value of the ‘Class’ field for students in Class 2C (and no
records for students from Class 1C have been entered). The problem can be rectified by the
following SQL statement.
UPDATE Student SET class = '2C'
WHERE class='1C';
Sample Query Q7_1_updatetable
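The role of the optional WHERE clause is easy to see by counting the rows that each form of UPDATE changes. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the Student table is cut down to two columns and the class label '9Z' is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StdID char(7) PRIMARY KEY, Class char(2))")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("0002011", "1C"), ("0002012", "2B"), ("0002013", "1C")],
)

# With WHERE: only the rows that satisfy the condition are changed.
with_where = conn.execute("UPDATE Student SET Class = '2C' WHERE Class = '1C'").rowcount

# Without WHERE: every row in the table is changed.
without_where = conn.execute("UPDATE Student SET Class = '9Z'").rowcount

classes = [row[0] for row in conn.execute("SELECT DISTINCT Class FROM Student")]
print(with_where, without_where, classes)
```

Forgetting the WHERE clause therefore overwrites the whole column, which is rarely what is intended.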
Result
Teaching remark

Except for numeric values, the operand(s) of the operator must be enclosed by a pair of single
quotation marks ‘’.
Update values in a number of columns
To modify values in a number of columns, we can use the following statement.
Syntax
UPDATE TableName
SET Column1 = NewValue1, Column2 = NewValue2
WHERE Condition(s)
The WHERE clause is optional.
If the WHERE clause is not used, the values in the specified column(s) of each row will be changed
to the new values.
Example
Suppose we wrongly entered the name and phone number of the student with student ID
‘0002011’ in the Student table earlier on. We can use the UPDATE statement to fix the problem.
The name and phone number of the student should be ‘Chan Ming Wai’ and ‘21111182’
respectively.
UPDATE Student SET Name = 'Chan Ming Wai', PhoneNo = '21111182'
WHERE StdID='0002011';
Sample Query Q7_2_Updateseveralcolumn
Result
Delete record(s) from the table
To delete record(s) from a table, the DELETE Statement can be used.
Syntax
DELETE FROM TableName
WHERE Condition(s)
The WHERE clause is optional.
Example 1
In this example, we would like to delete the record with the book ID “00000003” from the
“Book” table.
DELETE FROM Book
WHERE BookID='00000003';
Sample Query Q8_1_Deletefield
Teaching remarks

As the BookID field serves as a foreign key in the LoanRecord table to the Book table and there
is a corresponding record with the BookID equal to ‘00000003’ in the LoanRecord table, the
above DELETE statement cannot be executed successfully. The corresponding rows in the
LoanRecord table need to be removed in order to enable the query to run successfully.
Example 2
To delete all records in the table, we can simply use the DELETE statement without setting any
condition. After running the following SQL statement successfully, the Book table will become
empty.
DELETE FROM Book
Sample Query Q8_2_Deleteall
Teaching remark

Due to the same reason as indicated in the last “Teaching remark”, the above DELETE
statement cannot be executed successfully unless no corresponding rows in the LoanRecord
table are found.
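The two teaching remarks above can be reproduced in a few lines. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the tables are cut down to the key columns, the loan record number '4' is hypothetical, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch (assumption of this sketch)
conn.executescript("""
    CREATE TABLE Book (BookID char(8) PRIMARY KEY, Title varchar(100));
    CREATE TABLE LoanRecord (LoanRecID char(8) PRIMARY KEY, BookID char(8) NOT NULL,
                             FOREIGN KEY (BookID) REFERENCES Book);
    INSERT INTO Book VALUES ('00000003', 'Star Wing');
    INSERT INTO LoanRecord VALUES ('4', '00000003');
""")

# A loan record still refers to the book, so the delete is refused.
try:
    conn.execute("DELETE FROM Book WHERE BookID = '00000003'")
    first_attempt = "deleted"
except sqlite3.IntegrityError as err:
    first_attempt = "blocked"
    print("blocked:", err)

# Remove the referring rows first, then the book itself.
conn.execute("DELETE FROM LoanRecord WHERE BookID = '00000003'")
conn.execute("DELETE FROM Book WHERE BookID = '00000003'")
remaining = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
print("books left:", remaining)
```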
Result Presentation
For various reasons, users may sometimes want to organize the result of a query in ascending or
descending order of some selected fields. On other occasions, they may be interested in the
value of some aggregated attribute of the retrieved data, e.g., the total number of books that a
student has ever borrowed. The former can be achieved with the use of the ORDER BY clause
whereas the latter can be done with the use of the GROUP BY clause, both in a SELECT statement.
The ORDER BY clause
A query result can be sorted in ascending or descending lexicographical order of one or more
selected sort fields. A lexicographical ordering refers to how characters are ordered in the
corresponding encoding table.
Syntax
SELECT Column(s) FROM TableName
ORDER BY Column1 [ASC|DESC], Column2 [ASC|DESC], ...
Optional parts are put inside square brackets. A vertical bar stands for disjunction. Thus
[ASC|DESC] means that a user may use neither of the keywords, or either one.
Teaching remarks

The sort fields may or may not be selected for retrieval purposes.

The default sorting order is ascending (ASC) lexicographical order.
Example 1
Suppose we would like to sort all rows in the Student table in ascending order of the student name.
SELECT * FROM Student
ORDER BY Name
Sample Query Q9_1_sort
Result
Example 2
In the following example, we retrieve all rows in the Student table in ascending order of the Class
field.
SELECT * FROM Student
ORDER BY Class ASC
Sample Query Q9_2_SortASC
Result
Example 3
In this example, we would like to sort all records in the Student table in two levels: first in
descending order of the Class field, then in ascending order of the Name field.
SELECT * FROM Student
ORDER BY Class DESC, Name ASC
Sample Query Q9_2_SortASC2
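The two-level sort can be traced with the six students entered earlier. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003 and a Student table cut down to three columns; the class and name values are those left in place after the UPDATE examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StdID char(7) PRIMARY KEY, Name varchar(30), Class char(2))")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        ("0002011", "Chan Ming Wai", "2C"),
        ("0002012", "Wong Wai Ming", "2B"),
        ("0002013", "Cheung Ka Fai", "2C"),
        ("0002014", "Chang Wai Yee", "4A"),
        ("0002015", "Lee Oi Lam", "5C"),
        ("0002016", "Sze Yuk Ki", "7B"),
    ],
)

# Class decides the order first; Name only breaks ties within the same class.
rows = conn.execute(
    "SELECT Class, Name FROM Student ORDER BY Class DESC, Name ASC"
).fetchall()
for row in rows:
    print(row)
```

The two students of Class 2C are the only rows for which the second sort field matters.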
Result
The GROUP BY … HAVING clause
To facilitate data analysis, it is sometimes necessary to group the result in a suitable way.
This can be achieved with the use of the GROUP BY clause.
Syntax
SELECT Column(s) FROM TableName
GROUP BY Column1, Column2, ...
HAVING Condition(s)
The HAVING clause is optional.
Example 1
To count the number of students in each class, we can use the GROUP
BY clause (without the HAVING part) as below:
SELECT Class, count(*) AS Num
FROM Student
GROUP BY Class
ORDER BY Class DESC;
Sample Query Q10_Group
The AS keyword enables a user to assign a new label to a selected object. In the above example,
the output of the aggregate function COUNT(*) which counts the number of output rows in each
group (as specified by the GROUP BY clause) is labeled as ‘Num’.
Result
A clause which can only be used after the GROUP BY clause is HAVING. It comes after
GROUP BY (and before ORDER BY if that clause is needed as well). The purpose of HAVING
is to set selection criteria based on some aggregate values. The following SQL query counts the
number of students in each class whose students owe the library more than 20 dollars in overdue
fines in aggregate.
Example 2
SELECT Class, Count(*) AS Num
FROM Student
GROUP BY Class
HAVING SUM(OverduePay) > 20
ORDER BY Class DESC;
Sample Query Q10_Group_By-Having
The result of the query is as follows:
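The two GROUP BY queries can be traced with the six students entered earlier. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003 and a Student table cut down to three columns; the class values are those left in place after the UPDATE example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (StdID char(7) PRIMARY KEY, Class char(2), OverduePay decimal(5,2))"
)
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [
        ("0002011", "2C", 12.5),
        ("0002012", "2B", 30.5),
        ("0002013", "2C", 0),
        ("0002014", "4A", 20.5),
        ("0002015", "5C", 3),
        ("0002016", "7B", 1.5),
    ],
)

# One output row per class.
all_groups = conn.execute(
    "SELECT Class, COUNT(*) AS Num FROM Student GROUP BY Class ORDER BY Class DESC"
).fetchall()

# HAVING keeps only the classes whose total overdue fine exceeds 20.
big_debtors = conn.execute(
    "SELECT Class, COUNT(*) AS Num FROM Student GROUP BY Class "
    "HAVING SUM(OverduePay) > 20 ORDER BY Class DESC"
).fetchall()

print(all_groups)
print(big_debtors)
```

Class 2C has two students but a total of only 12.5, so it is dropped by HAVING even though it is the largest group.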
Teaching remarks

The WHERE clause sets selection criteria for the SELECT statement based on
non-aggregate value(s) only. Any selection based on aggregate value must be
done with the HAVING clause. A common student mistake is to use some
aggregate function(s) in a WHERE clause. Aggregate functions do not work
in a WHERE clause because it is given no information as to how records (i.e.,
table rows) are to be grouped. Such grouping information is provided to the
HAVING clause by the GROUP BY clause.

The SELECT statement can reference values generated by the aggregate functions
or columns specified in the GROUP BY clause only.
SELECT Class, Count(*) AS Num
FROM Student
GROUP BY Class
HAVING SUM(OverduePay) > 20 AND Class > '3'
ORDER BY Class DESC;

The HAVING clause can reference values generated by the aggregate functions or
columns specified in the GROUP BY clause only.

As shown in the above example, the parameter (which is a column) specified in an
aggregate function referred to by the HAVING clause need not be included
as a column referred to by the SELECT statement.
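The common mistake described in the first remark can be demonstrated directly. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the three rows are a cut-down version of the Student table, and the way the error is reported is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StdID char(7) PRIMARY KEY, Class char(2), OverduePay decimal(5,2))")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [("0002011", "2C", 12.5), ("0002013", "2C", 0), ("0002012", "2B", 30.5)],
)

# Mistake: an aggregate function inside WHERE is refused by the database.
try:
    conn.execute("SELECT Class FROM Student WHERE SUM(OverduePay) > 20")
    where_with_aggregate = "accepted"
except sqlite3.OperationalError as err:
    where_with_aggregate = "rejected"
    print("rejected:", err)

# Correct: group first, then filter the groups with HAVING.
having_rows = conn.execute(
    "SELECT Class FROM Student GROUP BY Class HAVING SUM(OverduePay) > 20"
).fetchall()
print(having_rows)
```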
Operators Used with WHERE
A number of operators can be used in conjunction with the WHERE clause to specify the
condition(s) for data retrieval.
The LIKE operator
Earlier on, we learnt to use the SELECT statement with a WHERE clause to select data from one
or more tables that meet specified condition(s). Most of those conditions require an exact match.
Sometimes, we may be interested in retrieving data based on a partial match. This is supported in
SQL by the LIKE operator. Wildcard characters (‘_’ and ‘%’) are used for specifying a retrieval
pattern. The ‘_’ stands for any single character while the ‘%’ stands for any sequence of zero or
more characters (including the empty string).
Teaching remarks


LIKE can only be used with CHAR and VARCHAR field types.
Unless the SQL-92 syntax is selected in Microsoft Access, the database uses ‘?’ and ‘*’ for ‘_’
and ‘%’ respectively.
Syntax
SELECT Column(s) FROM TableName
WHERE Column LIKE pattern
Example 1
In the following example, all student records with a name starting with ‘Ch’ are retrieved.
SELECT *
FROM Student
WHERE Name LIKE 'Ch%';
Sample Query Q11_like1
Result
Example 2
In this example, we would like to select all student records in which the second letter of the
student name is ‘h’ and the last letter is ‘i’.
SELECT *
FROM Student
WHERE Name LIKE '_h%i';
Sample Query Q11_like2
Result
Example 3
The following query selects all student records with at least one ‘u’ character in the student name.
SELECT *
FROM Student
WHERE Name LIKE '%u%'
Sample Query Q11_like3
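The three patterns can be compared side by side on the six student names. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003 and a Student table cut down to two columns. Note that SQLite's LIKE ignores case for ASCII letters by default, which may differ from other databases; the names used here give the same result either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StdID char(7) PRIMARY KEY, Name varchar(30))")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [
        ("0002011", "Chan Ming Wai"),
        ("0002012", "Wong Wai Ming"),
        ("0002013", "Cheung Ka Fai"),
        ("0002014", "Chang Wai Yee"),
        ("0002015", "Lee Oi Lam"),
        ("0002016", "Sze Yuk Ki"),
    ],
)

def names_like(pattern):
    """Return the student names matching the LIKE pattern, sorted by name."""
    cursor = conn.execute(
        "SELECT Name FROM Student WHERE Name LIKE ? ORDER BY Name", (pattern,)
    )
    return [row[0] for row in cursor]

starts_with_ch = names_like("Ch%")     # 'Ch' followed by anything
h_second_i_last = names_like("_h%i")   # any one character, 'h', anything, 'i'
contains_u = names_like("%u%")         # a 'u' anywhere in the name

print(starts_with_ch)
print(h_second_i_last)
print(contains_u)
```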
Result
The IN operator
When using the WHERE clause, it is possible to use IN to specify a list of values for a selected
column that the SELECT statement requires the retrieved rows to have.
Syntax
SELECT Column(s) FROM TableName
WHERE Column IN (value1,value2,...)
The value list (which is an operand) of the IN operator can be listed explicitly as shown in the
above syntax or generated by another SELECT statement. The latter is known as a nested SELECT
statement, which will be covered later.
Example
We use the IN operator to select records of student(s) whose name is ‘Cheung Ka Fai’ or ‘Wong
Wai Ming’.
SELECT *
FROM Student
WHERE Name IN ('Cheung Ka Fai','Wong Wai Ming');
Sample Query Q12_in
Result
The BETWEEN Operator
We can specify a range of values for a selected column using the BETWEEN operator within a
SELECT statement in order to require the corresponding field values of the retrieved rows to be
within the specified value range in an inclusive manner.
Syntax
SELECT Column(s) FROM TableName
WHERE Column
BETWEEN value1 AND value2
Example
We use the BETWEEN operator to select students with their student ID between 0002013 and
0002015.
SELECT *
FROM Student
WHERE StdID Between '0002013' AND '0002015';
Sample Query Q13_between
Result
Teaching remark

The result of the above query may be different in various databases as some may contain the
boundary records while some may not. However, according to the SQL-92 and SQL-99
standards, boundary records are to be included.
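Whether a particular database includes the boundary records can be checked with a short test. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003 and a Student table cut down to the StdID column; it shows only how SQLite behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StdID char(7) PRIMARY KEY)")
conn.executemany(
    "INSERT INTO Student VALUES (?)",
    [("0002011",), ("0002012",), ("0002013",), ("0002014",), ("0002015",), ("0002016",)],
)

ids = [
    row[0]
    for row in conn.execute(
        "SELECT StdID FROM Student "
        "WHERE StdID BETWEEN '0002013' AND '0002015' ORDER BY StdID"
    )
]
print(ids)

# The same range written with comparison operators, for cross-checking.
ids_explicit = [
    row[0]
    for row in conn.execute(
        "SELECT StdID FROM Student "
        "WHERE StdID >= '0002013' AND StdID <= '0002015' ORDER BY StdID"
    )
]
```

In SQLite both boundary IDs are returned, in line with the standards mentioned above.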
The AND Operator
By using the AND operator, we can require the retrieved row(s) of data to meet a number of
filtering conditions simultaneously.
Syntax
SELECT Column FROM TableName
WHERE Condition1 AND Condition2
Example
The following query retrieves student record(s) such that the student is in class 2C and has an
overdue fine to settle.
SELECT * FROM Student
WHERE Class = '2C' AND OverduePay > 0
Sample Query Q14_AND
Result
The OR operator
By using the OR operator, we can select data rows such that at least one of its operands (which is a
condition) is fulfilled.
Syntax
SELECT Column FROM TableName
WHERE Condition1 OR Condition2
Example
The following query retrieves the student records from the Student table such that the student is
either a member of Class 2C or named “Chang Wai Yee”.
SELECT *
FROM Student
WHERE Class='2C' OR Name='Chang Wai Yee';
Sample Query Q15_OR
Result
Example - using both AND and OR Operators in a query
Retrieve the student record of a Class 2C student whose name is “Chan Ming Wai” and the record
of another student whose name is “Chang Wai Yee”
SELECT *
FROM Student
WHERE (Class='2C' AND Name='Chan Ming Wai') OR Name='Chang Wai Yee';
Sample Query Q16_ANDOR
Result
Add alias to a column
Sometimes, the column name of the resultant table may not be expressive enough for display
purposes. In this case, we can assign an alias to the column of the resultant table using the AS
keyword.
Syntax
SELECT Column1 AS ColumnAlias1, Column2 AS ColumnAlias2,...
FROM TableName
Example
The following query assigns more meaningful labels to the fields retrieved from the Student table.
SELECT StdID AS Student_ID, Name AS Student_Name, PhoneNo AS Phone_Number
FROM Student;
Sample Query Q17_aliases
Result
Joining Tables
Sometimes, we may need to retrieve data from two or more tables. In this case, we can join tables
with the use of the relevant field(s) of the tables. In most cases, tables are joined according to search
conditions that find only the rows with matching values; this type of join is known as an inner
equijoin. Occasionally, non-equijoins, for example, that express a greater-than or less-than
relationship, may be used. In some other occasions, decision-support analysis may require outer
joins, which retrieve both matching and non-matching rows. The three types of outer joins are left
outer join, right outer join, and full outer join.
Equijoin
We can retrieve data from tables by setting up retrieval condition that requires the column values of
the “joined” tables being equal. In brief, equijoin is a join in which rows from two tables are
combined and added to the result set when there are equal values in the joined columns.
Syntax
SELECT TableName1.Column11, TableName1.Column12,...,
TableName2.Column21, TableName2.Column22,...
FROM TableName1, TableName2
WHERE equality_condition(s)
Example 1 (equijoin with repeated column)
In this example, we find details of students and the library services that they have accessed (i.e.,
borrowing, returning or reserving a book). The output will not include details of any student who
has never used a library service. To do so, we retrieve all details from the LoanRecord table and
the Student table where the values of the “StdID” column in both tables are equal.
SELECT *
FROM LoanRecord, Student
WHERE LoanRecord.StdID=Student.StdID;
Sample Query Q18_EJoin
Result
In the above example, the “StdID” field occurs twice in the equijoin output as it can be found in
both the LoanRecord and Student tables.
Obviously there is no point in repeating the same piece of information. One of the two identical
columns can be eliminated by changing the SELECT list. The result is called a natural join. More
exactly, the natural join operation produces a Cartesian product of its two argument tables, performs
a selection that enforces equality on attributes that appear in both tables, and removes duplicate
attributes at the end.
Example 2 (natural join)
Suppose we not only want to find the list of students who have accessed library services, but also
the title of books the student borrowed/returned/reserved. To do this, we join all the three tables
with the following query.
SELECT LoanRecord.LoanRecID, Student.Name, Book.Title
FROM LoanRecord, Student, Book
WHERE LoanRecord.StdID=Student.StdID AND LoanRecord.BookID=Book.BookID;
Sample Query Q19_EJoin2
By selecting the fields of interest only, the repeated occurrences of the same piece of information
shown in the previous example disappear, i.e. a natural join. In that sense, the natural join is a
subtype of the equijoin.
Result
Teaching remark

In SQL, all join conditions are to be specified explicitly. The fact that two tables have the same
attribute name, (e.g. StdID in the LoanRecord and Student tables), does not mean that a join will
be done between them automatically. Omitting the join conditions when joining tables will
result in an output that corresponds to the Cartesian product of the rows in the selected tables.
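The size of the Cartesian product is easy to verify on a small data set. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the tables are cut down to the columns needed and the two loan rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StdID char(7) PRIMARY KEY, Name varchar(30));
    CREATE TABLE LoanRecord (LoanRecID char(8) PRIMARY KEY, StdID char(7));
    INSERT INTO Student VALUES ('0002011', 'Chan Ming Wai');
    INSERT INTO Student VALUES ('0002012', 'Wong Wai Ming');
    INSERT INTO Student VALUES ('0002013', 'Cheung Ka Fai');
    -- Hypothetical loan rows.
    INSERT INTO LoanRecord VALUES ('1', '0002012');
    INSERT INTO LoanRecord VALUES ('2', '0002011');
""")

# No join condition: every loan row is paired with every student row.
without_condition = conn.execute("SELECT * FROM LoanRecord, Student").fetchall()

# With the join condition: only the pairs with matching StdID values remain.
with_condition = conn.execute(
    "SELECT * FROM LoanRecord, Student WHERE LoanRecord.StdID = Student.StdID"
).fetchall()

print(len(without_condition), "rows without a join condition")
print(len(with_condition), "rows with the join condition")
```

With 2 loan rows and 3 students, the query without a condition returns 2 x 3 = 6 rows, most of which pair a loan with the wrong student.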
The NATURAL JOIN operator
A NATURAL JOIN operation uses the columns that have the same name (and type) in both tables
to perform an equijoin. However, it relies on the SELECT statement to avoid retrieving the same
piece of information twice when implementing the natural join operation.
Syntax
SELECT TableName1.Column11, TableName1.Column12,...
FROM TableName1
NATURAL INNER JOIN TableName2
Example
In this example, we search for the list of students who have made use of the library service at least
once, just like the second equijoin example. However, this time we do the same query with a
natural join.
SELECT DISTINCT Student.Name
FROM Student
NATURAL INNER JOIN LoanRecord
Result
Name
Chan Ming Wai
Cheung Ka Fai
Wong Wai Ming
Note that the multiple occurrences of output records are eliminated with the use of DISTINCT.
Teaching remark

NATURAL JOIN is not supported by Access 2003. However it is easy to model the
NATURAL JOIN operation with the INNER JOIN operation as shown in the next section.
The INNER JOIN operator
Rows of two tables can be joined together by using the INNER JOIN operator when the selected
rows meet some specified condition(s). Rows that fail to meet the conditions will not be selected.
As the prevailing condition type used is the test of equality, INNER JOIN is often used to
implement the concept of equijoin.
Syntax
SELECT Column1, Column2,…
FROM TableName1
INNER JOIN TableName2
ON Condition(s)
Example
The query below models the NATURAL JOIN example given in the last section.
SELECT distinct Student.Name
FROM Student
INNER JOIN LoanRecord on (Student.stdid = LoanRecord.stdid)
Sample Query Q19_InnerJoin
Resultant Table:
The LEFT (OUTER) JOIN operator
The result of a LEFT JOIN operation contains every row from the first table and all matching rows
in the second table. Rows found only in the second table are not displayed. If a row in the first
table has no match in the second table, the fields corresponding to the second table in the output
row will be filled with null.
Syntax
SELECT Column1, Column2,...
FROM TableName1
LEFT JOIN TableName2
ON Condition(s)
Example
To view all library services that the students have accessed as well as those students who have not
made use of the library services at all, we can use the following query.
SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID
FROM Student
LEFT JOIN LoanRecord
ON LoanRecord.StdID=Student.StdID;
Sample Query Q19_LeftJoin
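The null filling for unmatched rows can be seen on a small data set. The following is a minimal sketch, assuming Python's built-in sqlite3 module rather than MS Access 2003; the tables are cut down to the columns needed and the two loan rows are hypothetical. Python shows a null as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (StdID char(7) PRIMARY KEY, Name varchar(30));
    CREATE TABLE LoanRecord (LoanRecID char(8) PRIMARY KEY, StdID char(7));
    INSERT INTO Student VALUES ('0002011', 'Chan Ming Wai');
    INSERT INTO Student VALUES ('0002012', 'Wong Wai Ming');
    INSERT INTO Student VALUES ('0002014', 'Chang Wai Yee');
    -- Hypothetical loan rows; student 0002014 has none.
    INSERT INTO LoanRecord VALUES ('1', '0002012');
    INSERT INTO LoanRecord VALUES ('2', '0002011');
""")

rows = conn.execute("""
    SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID
    FROM Student
    LEFT JOIN LoanRecord
    ON LoanRecord.StdID = Student.StdID
    ORDER BY Student.StdID
""").fetchall()
for row in rows:
    print(row)
```

The student without any loan record still appears in the output, with None in the LoanRecID column.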
Result
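The null-filling behaviour of LEFT JOIN can be seen in a small sketch. It assumes SQLite via Python's sqlite3 module instead of Access, and the three students and three loan records are hypothetical sample rows; nulls come back to Python as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StdID TEXT, Name TEXT);
CREATE TABLE LoanRecord (LoanRecID INTEGER, StdID TEXT);
INSERT INTO Student VALUES
  ('0002011', 'Chan Ming Wai'), ('0002012', 'Wong Wai Ming'), ('0002014', 'Chang Wai Yee');
INSERT INTO LoanRecord VALUES (1, '0002012'), (2, '0002011'), (5, '0002011');
""")

# Every student appears; a student without loans gets a null LoanRecID.
rows = conn.execute(
    "SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID "
    "FROM Student LEFT JOIN LoanRecord ON LoanRecord.StdID = Student.StdID "
    "ORDER BY Student.StdID, LoanRecord.LoanRecID"
).fetchall()

for row in rows:
    print(row)
```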
The RIGHT (OUTER) JOIN operator
The RIGHT JOIN returns all the rows contained in the second table and all matching rows in the
first table. If a row has no match in the first table, the fields corresponding to the first table in
that output row are filled with null.
Syntax
SELECT Column1, Column2,...
FROM TableName1
RIGHT JOIN TableName2
ON Condition(s)
Example
To view the student ID, student name, and the types of library services that the student had made
use of, we may use the following query.
SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID
FROM Student
RIGHT JOIN LoanRecord
ON LoanRecord.StdID=Student.StdID;
Sample Query Q19_RightJoin
Result
The FULL (OUTER) JOIN operator
Unlike LEFT JOIN and RIGHT JOIN which do not include all non-matching rows into the output
table, the result of the FULL (OUTER) JOIN contains those rows that are unique to each table, as
well as those rows that are common to both tables. The fields corresponding to any non-matching
table rows will be given a null in the relevant output rows.
Syntax
SELECT Column1, Column2,...
FROM TableName1
FULL OUTER JOIN TableName2
ON Condition(s)
Example
We modify the RIGHT JOIN example by replacing the RIGHT JOIN by a FULL JOIN.
SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID
FROM Student
FULL JOIN LoanRecord
ON LoanRecord.StdID = Student.StdID
Result
StdID     Name            LoanRecID
0002011   Chan Ming Wai   2
0002011   Chan Ming Wai   5
0002014   Chang Wai Yee   (null)
0002013   Cheung Ka Fai   (null)
0002015   Lee Oi Lam      (null)
0002016   Sze Yuk Ki      (null)
0002012   Wong Wai Ming   1
0002012   Wong Wai Ming   3
(null)    (null)          4
Teaching remark

FULL (OUTER) JOIN is not supported by Access 2003. However it is easy to model the
FULL (OUTER) JOIN operation by “integrating” the results of the LEFT JOIN operation and
the RIGHT JOIN operation as shown in the next section.
Combining Query Results
The results of two queries can be merged into a single result table with UNION or INTERSECT.
UNION puts the rows of both component query results into the result table, whereas INTERSECT
keeps only the rows that appear in both component query results. The rows returned by one query
can be removed from the result of another query with MINUS. Another way to combine query
results is the nested query, which is implemented with multiple SELECT statements: the result of
one SELECT statement is used as part of another SELECT statement.
The UNION operator
It is often useful to merge the results of two queries into a single output table. This can be done
with the UNION operator. UNION works only if every query in the statement returns the same
number of columns and each pair of corresponding columns has the same type. UNION
eliminates all duplicate output rows.
Syntax
SQL_Statement1
UNION
SQL_Statement2
Example
The following example implements the FULL OUTER JOIN example using LEFT JOIN, RIGHT
JOIN and UNION.
SELECT Student.StdID,Student.Name,LoanRecord.LoanRecID
FROM Student
LEFT JOIN LoanRecord
ON LoanRecord.StdID = Student.StdID
UNION
SELECT Student.StdID,Student.Name,LoanRecord.LoanRecID
FROM Student
RIGHT JOIN LoanRecord
ON LoanRecord.StdID = Student.StdID;
Sample Query Q20_Union
Result
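The UNION technique can be tried out in a small sketch. It assumes SQLite via Python's sqlite3 module rather than Access; because older SQLite versions have no RIGHT JOIN, the second branch is written as a LEFT JOIN with the tables swapped, which returns the same rows. The sample rows, including the loan record with the unknown student ID '0009999', are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StdID TEXT, Name TEXT);
CREATE TABLE LoanRecord (LoanRecID INTEGER, StdID TEXT);
INSERT INTO Student VALUES ('0002011', 'Chan Ming Wai'), ('0002014', 'Chang Wai Yee');
-- Loan 4 refers to a student ID that does not exist in Student.
INSERT INTO LoanRecord VALUES (2, '0002011'), (4, '0009999');
""")

full_join = conn.execute("""
SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID
FROM Student LEFT JOIN LoanRecord ON LoanRecord.StdID = Student.StdID
UNION
SELECT Student.StdID, Student.Name, LoanRecord.LoanRecID
FROM LoanRecord LEFT JOIN Student ON LoanRecord.StdID = Student.StdID
""").fetchall()

for row in full_join:
    print(row)
```

The matching row is produced by both branches, and UNION keeps only one copy of it. Note that UNION would also merge genuinely duplicate rows of the joined result, so this emulation is exact only when the selected columns identify each row uniquely.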
The INTERSECT operator
The INTERSECT operator returns only those rows that are common to the results returned by two
or more query expressions. INTERSECT works only if every query in the statement returns the
same number of columns and each pair of corresponding columns has the same type.
INTERSECT eliminates all duplicate output rows.
Syntax
SQL_Statement1
INTERSECT
SQL_Statement2
Example
The following query identifies those students whose names have the substrings “Wai” and “Chan”.
SELECT Name
FROM Student
WHERE Name LIKE '%Wai%'
INTERSECT
SELECT Name
FROM Student
WHERE Name LIKE '%Chan%'
Result
Name
Chan Ming Wai
Chang Wai Yee
Teaching remark

INTERSECT is not supported by Access 2003.
The EXCEPT/MINUS operator
The EXCEPT/MINUS operator returns only those rows that appear in the first query result but not
in the second query result. EXCEPT/MINUS eliminates all duplicate output rows. EXCEPT is
defined in SQL92, whereas MINUS is used by Oracle for the same purpose.
Syntax
SQL_Statement1
EXCEPT
SQL_Statement2
Example
The following query identifies those students whose names have the substring “Wai” but not
“Chan”.
SELECT Name
FROM Student
WHERE Name LIKE '%Wai%'
EXCEPT
SELECT Name
FROM Student
WHERE Name LIKE '%Chan%'
Result
Name
Wong Wai Ming
Teaching remark

EXCEPT/MINUS is not supported by Access 2003.
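Where INTERSECT and EXCEPT are unavailable, the two name queries above can be rewritten with AND and AND NOT, because both component queries read the same column of the same table. The sketch below compares the two forms; it assumes SQLite via Python's sqlite3 module (which does support INTERSECT and EXCEPT), and the Student table holds only the names used in this chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (Name TEXT);
INSERT INTO Student VALUES
  ('Chan Ming Wai'), ('Wong Wai Ming'), ('Cheung Ka Fai'),
  ('Chang Wai Yee'), ('Lee Oi Lam'), ('Sze Yuk Ki');
""")

def names(sql):
    """Run a query and return the names it yields as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

both_set = names("SELECT Name FROM Student WHERE Name LIKE '%Wai%' "
                 "INTERSECT SELECT Name FROM Student WHERE Name LIKE '%Chan%'")
both_and = names("SELECT DISTINCT Name FROM Student "
                 "WHERE Name LIKE '%Wai%' AND Name LIKE '%Chan%'")

only_set = names("SELECT Name FROM Student WHERE Name LIKE '%Wai%' "
                 "EXCEPT SELECT Name FROM Student WHERE Name LIKE '%Chan%'")
only_and = names("SELECT DISTINCT Name FROM Student "
                 "WHERE Name LIKE '%Wai%' AND Name NOT LIKE '%Chan%'")

print(both_set, only_set)
```

This rewrite is a special case: when the two component queries read different tables, a nested SELECT with IN or NOT IN is the usual substitute.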
Using nested SELECT statement
Besides the UNION operator, we can use a nested SELECT statement to combine query results, so
that the result of one SELECT statement supplies the values of an input parameter for another
SELECT statement. For simplicity, we give only the syntax of nested SELECT statements that use
the =ANY operator or the IN operator. Other commonly used operators in nested SELECT
statements are >ALL, <ALL, >=ALL, and <=ALL. The latter two are widely used to find the
maximum and minimum values, respectively, from a list of selected values (see Example2).
Syntax (=ANY)
SELECT Column1, Column2, ...
FROM TableName1, TableName2
WHERE Column =ANY (SELECT Column3 FROM TableName3)
Note that the =ANY operator can be replaced by the IN operator in the above case.
Example1
In this example, we would like to find students in 2C class who had used some library service(s)
before.
SELECT name
FROM Student
WHERE (StdID =ANY (SELECT StdID FROM LoanRecord)) AND Class = '2C';
Sample Query Q21_NestSelect
Another way to implement the above query is as follows:
SELECT DISTINCT Name
FROM Student, LoanRecord
WHERE Student.StdID = LoanRecord.StdID AND Class = '2C';
To save the effort of referencing the LoanRecord table given earlier, its table content is displayed
again as below.
Table “LoanRecord”
LoanRecID   StdID     BookID     DateOfBorrow   Status
1           0002012   00000001   20051001       1
2           0002011   00000002   20020112       2
3           0002012   00000003   20031211       2
4           0002013   00000002   20031001       2
5           0002011   00000002   20051018       1
Result
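The two formulations of Example1 can be compared in a small sketch. It assumes SQLite via Python's sqlite3 module, which has no =ANY operator, so the nested query is written with the equivalent IN operator. The class assignments of the four students are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StdID TEXT, Name TEXT, Class TEXT);
CREATE TABLE LoanRecord (LoanRecID INTEGER, StdID TEXT);
-- Class values below are made up for illustration.
INSERT INTO Student VALUES
  ('0002011', 'Chan Ming Wai', '2C'), ('0002012', 'Wong Wai Ming', '2C'),
  ('0002013', 'Cheung Ka Fai', '2D'), ('0002014', 'Chang Wai Yee', '2C');
INSERT INTO LoanRecord VALUES
  (1, '0002012'), (2, '0002011'), (3, '0002012'), (4, '0002013'), (5, '0002011');
""")

# Nested SELECT: the inner query supplies the list of IDs to test against.
nested = conn.execute(
    "SELECT Name FROM Student "
    "WHERE StdID IN (SELECT StdID FROM LoanRecord) AND Class = '2C' "
    "ORDER BY Name"
).fetchall()

# Join version: DISTINCT is needed because a student may have several loans.
joined = conn.execute(
    "SELECT DISTINCT Name FROM Student, LoanRecord "
    "WHERE Student.StdID = LoanRecord.StdID AND Class = '2C' "
    "ORDER BY Name"
).fetchall()

print(nested)
print(nested == joined)
```

The nested form needs no DISTINCT, since each Student row is tested once against the list of IDs rather than being paired with every matching loan.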
Example2
In this example, we would like to find the student who owes the largest amount of overdue fine to
the library.
SELECT Name, OverduePay
FROM Student
WHERE OverduePay >=ALL (SELECT OverduePay FROM Student)
Sample Query Q21_NestSelect_2
Result
Arithmetic Operators/Functions
The following arithmetic operators/functions can be used in relevant expressions within a SQL
statement.
Operator/Function         Description
+                         The arithmetic add operator or unary plus operator
-                         The arithmetic subtract operator or unary minus operator
*                         The multiply operator (also a shorthand for all columns after the SELECT keyword)
/                         The arithmetic divide operator
ABS(numeric-expression)   Turns the value of a numeric expression into its absolute value
Table 8. Some SQL arithmetic operators/functions.
For example, if the loan period for a library item is 28 days, we can use NOW()+28 to compute the
due date for return if the item is loaned to a library user now.
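Date arithmetic is one of the least portable parts of SQL, so NOW()+28 should be read as Access syntax. As a point of comparison, the sketch below computes the same 28-day due date in SQLite (through Python's sqlite3 module), where the date() function takes a modifier string instead of a numeric addition. The loan date is a fixed, hypothetical value so that the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

loan_date = "2005-10-01"  # hypothetical loan date, in ISO format
due_date = conn.execute(
    "SELECT date(?, '+28 days')", (loan_date,)
).fetchone()[0]

print(due_date)  # 2005-10-29
```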
String Functions
Some SQL-92 string functions are included below.
Function:    CHAR_LENGTH(string-expression) or CHARACTER_LENGTH(string-expression) for SQL92
             LENGTH(string-expression) for Oracle
             LEN(string-expression) for Access
Description: Returns the number of characters in a string-expression.
Example:     The following statement will return the value 7.
             SELECT CHAR_LENGTH('library')

Function:    LOWER(string-expression) for SQL92
             LCASE(string-expression) for Access
Description: Converts all letters in a string to lower case.
Example:     The following statement will return 'library'.
             SELECT LOWER('Library')

Function:    UPPER(string-expression) for SQL92
             UCASE(string-expression) for Access
Description: Converts all letters in a string to upper case.
Example:     The following statement will return 'LIBRARY'.
             SELECT UPPER('Library')

Function:    TRIM(string-expression) for both SQL92 and Access
Description: Removes leading and trailing blanks from a string.
Example:     The following statement will return the value 9.
             SELECT CHAR_LENGTH(TRIM(' chocolate '))

Function:    SUBSTRING/SUBSTR(string-expression, start, length) for SQL92
             MID(string-expression, start, length) for Access
Description: Returns a substring of a string.
Example:     The following statement will return 'library'.
             SELECT SUBSTRING('library system', 1, 7)

Table 9. Some SQL string functions.
Teaching remark

It appears that many databases introduce their own built-in functions although many of those
functions in fact offer the same functionality as the corresponding SQL-92 built-in functions.
It is important to check carefully before teaching the topic.
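One way to check a particular database is to run the table's examples against it. The sketch below does this for SQLite through Python's sqlite3 module, which is an assumption made for illustration; SQLite spells the functions LENGTH, LOWER, UPPER, TRIM and SUBSTR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row containing the result of each example from the string function table.
length, lower, upper, trimmed_length, prefix = conn.execute("""
SELECT LENGTH('library'),
       LOWER('Library'),
       UPPER('Library'),
       LENGTH(TRIM('  chocolate  ')),
       SUBSTR('library system', 1, 7)
""").fetchone()

print(length, lower, upper, trimmed_length, prefix)
```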
Aggregate Functions
Aggregation functions enable the user to perform tasks on more than just one record. They can be
used to perform data calculations, such as maximum, minimum, or average.
Function            Usage
AVG(expression)     Computes the average value of a column by the expression
COUNT(expression)   Counts the rows defined by the expression
COUNT(*)            Counts all rows in the specified table or view
MIN(expression)     Finds the minimum value in a column by the expression
MAX(expression)     Finds the maximum value in a column by the expression
SUM(expression)     Computes the sum of column values by the expression
Table 10. Some SQL aggregate functions.
The AVG function
The AVG function returns the average value of the selected column. Note that any null values will
not be included in the calculation.
Syntax
SELECT AVG(Column) FROM TableName
Example
We can calculate the average overdue payment per library user by using the AVG function:
SELECT AVG(OverduePay) AS Average_Overdue_Payment FROM Student
Sample Query Q22 AVG
Result
Note that the overdue payment of Cheung Ka Fai (student ID 0002013) is set to zero, so the value
is included in the computation of the average. If the field were set to null instead, it would be
excluded and the average overdue payment per user would be 13.6.
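The difference between a zero and a null can be reproduced in a small sketch. It assumes SQLite via Python's sqlite3 module, and the individual OverduePay amounts are hypothetical; they are chosen only so that they total 68 across six students, which gives 13.6 once one student is excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StdID TEXT, OverduePay REAL);
-- Hypothetical amounts; student 0002013 owes nothing.
INSERT INTO Student VALUES
  ('0002011', 10), ('0002012', 20), ('0002013', 0),
  ('0002014', 5),  ('0002015', 30), ('0002016', 3);
""")

query = "SELECT AVG(OverduePay) FROM Student"

# Zero is a value, so all six rows are averaged: 68 / 6.
avg_with_zero = conn.execute(query).fetchone()[0]

# Null is the absence of a value, so only five rows are averaged: 68 / 5.
conn.execute("UPDATE Student SET OverduePay = NULL WHERE StdID = '0002013'")
avg_with_null = conn.execute(query).fetchone()[0]

print(round(avg_with_zero, 2), round(avg_with_null, 2))
```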
The COUNT function
The COUNT function is useful when we would like to find the total number of records retrieved
subject to certain selection criteria. There are two kinds of COUNT functions:
COUNT(Expression) and COUNT(*). The first counts the records for which the Expression
(typically a column name) evaluates to a non-null value. The second returns the total number of
records that meet the selection criteria specified in the SELECT statement, whether or not the
records contain nulls.
Syntax - COUNT (column)
SELECT COUNT(Column) FROM TableName
Example
The following query counts the number of inactive library users who have never made use of any
library service.
SELECT Count(*)-Count(LoanRecord.Status) AS Number_of_idle_users
FROM Student
LEFT JOIN LoanRecord
ON Student.StdID = LoanRecord.StdID;
Sample Query Q23_COUNT
Result
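The arithmetic behind Count(*)-Count(LoanRecord.Status) can be traced in a small sketch. It assumes SQLite via Python's sqlite3 module and uses the six student names and the five loan records shown in this chapter, reduced to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StdID TEXT, Name TEXT);
CREATE TABLE LoanRecord (LoanRecID INTEGER, StdID TEXT, Status INTEGER);
INSERT INTO Student VALUES
  ('0002011', 'Chan Ming Wai'), ('0002012', 'Wong Wai Ming'), ('0002013', 'Cheung Ka Fai'),
  ('0002014', 'Chang Wai Yee'), ('0002015', 'Lee Oi Lam'), ('0002016', 'Sze Yuk Ki');
INSERT INTO LoanRecord VALUES
  (1, '0002012', 1), (2, '0002011', 2), (3, '0002012', 2), (4, '0002013', 2), (5, '0002011', 1);
""")

# COUNT(*) counts every joined row; COUNT(Status) skips rows where Status is null,
# which are exactly the students the LEFT JOIN could not match to a loan.
all_rows, with_status = conn.execute(
    "SELECT COUNT(*), COUNT(LoanRecord.Status) "
    "FROM Student LEFT JOIN LoanRecord ON Student.StdID = LoanRecord.StdID"
).fetchone()

idle_users = all_rows - with_status
print(all_rows, with_status, idle_users)
```

With these rows the join produces eight rows, five of which carry a Status, leaving three idle users.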
Syntax - COUNT(*)
SELECT COUNT(*) FROM TableName
Example
The following query counts the number of students whose names start with “Chan”.
SELECT Count(*) AS Number_of_students
FROM Student
WHERE (Student.Name) LIKE "Chan%";
Sample Query Q24_COUNT
Result
The MAX function
The MAX function finds the maximum value in a selected table column.
Syntax
SELECT MAX(Column) FROM TableName
Example
The following query finds the student who owes to the library the largest amount of overdue fine.
SELECT Name, OverduePay
FROM Student
WHERE OverduePay = (SELECT MAX(OverduePay) FROM Student)
Sample Query Q25_MAX
The following SQL script implements the same query without using the MAX function.
SELECT Name, OverduePay
FROM Student
WHERE OverduePay >=ALL (SELECT OverduePay FROM Student)
Result
Teaching remark

Many students may produce a SQL script similar to the one below for the above
query.
SELECT Name, MAX(OverduePay)
FROM Student
The above query violates the syntactic rules of SQL. The problem lies in the
fact that several student names can be retrieved from the Student table (which
correspond to several rows in the output table), but an aggregate function like
MAX() without a GROUP BY clause returns exactly one row in the output table.
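A further reason to prefer the subquery form is that it copes with ties: every student whose fine equals the maximum is returned. The sketch below illustrates this under the assumption of SQLite via Python's sqlite3 module; the three fines are hypothetical and deliberately contain a tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (Name TEXT, OverduePay REAL);
-- Hypothetical fines: two students share the largest amount.
INSERT INTO Student VALUES
  ('Chan Ming Wai', 30), ('Wong Wai Ming', 30), ('Lee Oi Lam', 5);
""")

# The inner query yields one value (the maximum); the outer query
# then selects every row that carries that value.
top = conn.execute(
    "SELECT Name, OverduePay FROM Student "
    "WHERE OverduePay = (SELECT MAX(OverduePay) FROM Student) "
    "ORDER BY Name"
).fetchall()

print(top)
```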
The MIN function
The MIN function finds the minimum value in a selected table column.
Syntax
SELECT MIN(Column) FROM TableName
Example
The following query finds the student who owes to the library the least amount of overdue fine.
SELECT Name, OverduePay AS Overdue_Fine
FROM Student
WHERE OverduePay = (SELECT MIN(OverduePay) FROM Student)
Sample Query Q26_MIN
Teaching remark

The following SQL script implements the same query without using the MIN
function.
SELECT Name, OverduePay
FROM Student
WHERE OverduePay <=ALL (SELECT OverduePay FROM Student)
Result
The SUM Function
The SUM function computes the sum of values from a selected column.
Syntax
SELECT SUM(Column) FROM TableName
Example
The query below gives the total amount of outstanding overdue fine for each class of students.
SELECT Class, SUM(OverduePay) AS Overdue_Fine
FROM Student
GROUP BY Class
Sample Query Q27_SUM
Result
Miscellaneous SQL Features
Create/Drop Table Index
Indexing is commonly used to enhance the performance of a database system. With the CREATE
INDEX statement, we can create an index (a pre-processed lookup list) for a database table so as
to provide an efficient access path to its rows. When running a query, the database examines any
relevant index for more efficient data access instead of traversing the entire table. To delete an
index, the DROP INDEX statement can be used.
Syntax (CREATE INDEX)
CREATE INDEX IndexName
ON TableName(Column1,Column2,...)
Syntax (DROP INDEX)
DROP INDEX IndexName
ON TableName
Example (CREATE INDEX)
In the following, we create an index for the class and student number combination in the Student
table as the two fields are often accessed by various queries.
CREATE INDEX ind_class_stdID
ON Student (class, StdID)
Sample Query Q28_CREATE_INDEX
Result
Example (DROP INDEX)
The following statement deletes the index that was created in the previous example.
DROP INDEX ind_class_stdID
ON Student
Sample Query Q28_DROP_INDEX
Result
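Whether a query actually uses an index can be inspected in most database systems. The sketch below assumes SQLite via Python's sqlite3 module, where EXPLAIN QUERY PLAN describes the chosen access path; note that SQLite's DROP INDEX takes no ON TableName clause, unlike the Access syntax above. The helper index_exists is a hypothetical name introduced for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (StdID TEXT, Name TEXT, Class TEXT)")
conn.execute("CREATE INDEX ind_class_stdID ON Student (Class, StdID)")

# Ask SQLite how it would run a query that filters on the indexed columns.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT Name FROM Student WHERE Class = '2C' AND StdID = '0002011'"
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)  # last column holds the description
print(plan_text)

def index_exists(name):
    """Return True if an index with this name is listed in SQLite's catalogue."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type = 'index' AND name = ?",
        (name,),
    ).fetchone()
    return row[0] == 1

existed_before = index_exists("ind_class_stdID")
conn.execute("DROP INDEX ind_class_stdID")  # SQLite form: no ON clause
existed_after = index_exists("ind_class_stdID")
print(existed_before, existed_after)
```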
Exporting Data from MS Access
Most databases are equipped with some data export facility so that data within a database can be
“exported” for use by other applications. Many of them allow the export not only of data, but
also of table structures, queries and other database objects. Those features enable users to
migrate their data from one database to another. In this section, we briefly describe the data
export facility in Microsoft Access. Specifically, it allows its users to export data in text, HTML
or Microsoft Excel format.
Export Data from an MS Access Database to Another Access Database
1. Open the existing MS Access database and select the database object that you want to export by
clicking on it.
2. Click File > Export on the menu bar.
3. Enter the file name of another Access database (.mdb) and click the Export button.
Export Data from an MS Access Database in other file formats
1. Open the existing MS Access database and select the database object that you want to export by
clicking on it.
2. Click File > Export on the menu bar.
3. Change the file format to “TEXT”, “HTML” or “EXCEL”.
4. Enter the file name and click the Export button.
Database Class Practice Activities 01
http://www.yll.edu.hk/~yll-cym/ca/download/database_activity_01.mdb
There are 3 tables in a database, the structures are shown below:
CLUB
Field      Data type   Width   Description
studID     character   20      ID of the students
clubname   character   20      The name of the club
fee        character   1       If the fee is settled, ‘Y’, ‘N’ otherwise
position   character   20      The position of the students in that club

Info
Field      Data type   Width   Description
sex        character   1       Sex of the students
name       character   20      Name of the students
address    character   100     District, e.g. “Yuen Long”, “Tin Shui Wai”
class      character   3       The class, e.g. 1A, 2A
classno    character   2       The class number
studID     character   7       ID of the students

Result
Field      Data type   Width   Description
subj       character   20      The name of subject
mark       integer     /       The mark of the student for that subject
studID     character   7       The ID of the student
Step   Description of the requirement / Corresponding SQL
1.
Create a table called club with the above structure:
CREATE TABLE club
(
studID varchar(20),
clubname varchar(20),
fee char(1),
position varchar(20)
)
2.
Add a new field called “skill” in the table “club” which is a character field with 20 character width.
ALTER TABLE club ADD skill varchar(20)
3.
Reduce all the mark by 10 for the subject “chin” in the table “result”.
UPDATE result SET mark = mark-10 WHERE subj="chin"
4.
Reset the table “club” such that all the position is changed to “senior member”.
UPDATE club SET position = 'senior member'
5.
Set the skill to “violin” for the student whose studID is “2006004” in the table club.
UPDATE club SET skill = 'violin' WHERE studID='2006004';
6.
Set the fee to ‘N’ for the clubname = ‘maths’ in the table “club”.
UPDATE club SET fee = 'N' WHERE clubname="maths";
7.
Insert a new record with the following information:
studID
clubname
fee
position
skill
2006004
music
N
member
piano
INSERT INTO club VALUES ('2006004', 'music', 'N', 'member', 'piano');
8.
Insert a new record with the following information:
clubname
studID
position
fee
chem
2006007
member
Y
INSERT INTO club ( clubname, studID, position, fee )
VALUES ('chem', '2006007', 'member', 'Y');
9.
Cancel the field “skill” in the table club.
ALTER TABLE club DROP skill
10.
Create a table called staff with the following structure:
Field
Data type
Width
Description
id
character
5
ID of the staff
name
character
20
Name of the staff
dob
date
/
Date of birth
salary
numeric
6, zero decimal
Salary of the staff
CREATE TABLE staff
(
id char(5),
name varchar(20),
dob date,
salary numeric(6,0)
)
11.
List all the data from the table “info”.
SELECT * FROM info;
12.
List the fields “class” and “name” from the table “info”
SELECT class, name FROM info;
13.
In each class, there are students from different district (i.e. the field address). Show the class and
the address with no duplication.
SELECT DISTINCT class, address FROM info
14.
Show the name of the students and his or her class, the address
would have a letter “e” and the name would start with the letter
“M”.
SELECT name, class FROM info
WHERE address LIKE "%e%" AND name LIKE "M%"
15.
Show the studID and his or her mark, select only those mark that
are in the range of 75 and 95.
SELECT studID, mark FROM result WHERE mark BETWEEN 75 AND 95
16.
What does it mean for the following SQL?
SELECT studID, mark FROM result
WHERE subj="Eng" ORDER BY mark;
It will show the student ID (studID) and the mark of the students of the subject
“Eng”, the order of the list is according to the mark in ascending order.
19.
Show the address, class and the name of the
students, the list should be sorted by the address
in descending order, then, sorted by class and
then name accordingly in ascending order, the
right diagram shows the result:
SELECT address, class, name FROM info ORDER BY address DESC , class, name;
20.
For each subject, shows the average mark.
SELECT subj, AVG(mark) FROM result GROUP BY subj;
21.
Show the average mark of the student ID (studID) form the table
“result”.
SELECT studID, AVG(mark) FROM result GROUP BY studID;
22.
Show how many 1A students lived in each district (the field
address in the table “info”).
SELECT address, COUNT(*) FROM info WHERE class="1A" GROUP BY address;
23
For each district, show the number of students who are living in
that district. Name the field address as “district”, Count(*) as “cnt”.
SELECT address AS district, COUNT(*) AS cnt FROM info
GROUP BY address HAVING COUNT(*)<5;
26.
For each student, shows his or her average mark.
SELECT studID, AVG(mark) FROM result
GROUP BY studID HAVING AVG(mark)>70;
27.
Show each students’ marks in each subject.
[Use Inner Join]
SELECT info.class, info.name, result.subj, result.mark FROM info, result
WHERE info.studID=result.studID;
28.
Show a list of Form 1 girls whose English mark is at least 80.
SELECT info.class, info.name, result.mark
FROM info, result
WHERE info.studID=result.studID
And info.sex='F' And result.mark>=80
And result.subj='Eng' And class Like "1%";
29.
Show a list of students and their number of
subjects in the table “Result”, e.g. Ming Chan
has marks in both the subjects in “Eng” and
“Chin”. So, Ming Chan will have 2 in
number_subj.
SELECT info.class, info.classno, info.name, COUNT(*) AS number_subj
FROM info, result
WHERE info.studID=result.studID
GROUP BY info.class, info.classno, info.name
30.
Show a list of students average marks on class and
subject basis.
SELECT info.class, result.subj, ROUND(AVG(result.mark),2) AS average
FROM info, result
WHERE info.studID=result.studID
GROUP BY info.class, result.subj;
31.
Show a list of students who have failed a subject (mark below 50), together with
the subjects they failed.
SELECT info.class, info.name, result.subj, result.mark FROM info, result
WHERE result.studID=info.studID And result.mark<50;
32.
Show a list of students who have not joined any club in
the school.
[Hint: You should focus on how to find all those studID
values that do not appear in the table “club”.]
SELECT studID, name FROM info
WHERE studID NOT IN (SELECT studID FROM club);
33.
Show a list of the IDs of students who are members of both the club “eng” and the club “chin”.
SELECT studID FROM club
WHERE clubname = 'eng' AND studID IN
(SELECT studID FROM club WHERE clubname='chin');
34.
Show a list of students ID who may be a member of ‘eng’ or a member of ‘chin’.
[Note: Try to use the command ‘UNION’ instead of ‘OR’.]
SELECT studID FROM club WHERE clubname = 'eng'
UNION
SELECT studID FROM club WHERE clubname = 'chin';
35.
Show a list of students ID (and their marks) whose mark is
higher than the average in the subject ‘eng’.
SELECT studID, mark FROM result WHERE subj = 'eng' AND mark >
(SELECT AVG(mark) FROM result WHERE subj ='eng');
36.
Show a list of students and the club they attended to by
making use of the command LEFT JOIN.
SELECT info.name, club.clubname
FROM info LEFT JOIN club ON club.studID=info.studID;
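The answers to steps 32 to 34 can be tried out without the Access file. The sketch below assumes SQLite via Python's sqlite3 module and three hypothetical students (only “Ming Chan” appears in the activity text; the IDs and the other names are made up), with info and club cut down to the columns these queries use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE info (studID TEXT, name TEXT);
CREATE TABLE club (studID TEXT, clubname TEXT);
-- Hypothetical sample rows.
INSERT INTO info VALUES
  ('2006001', 'Ming Chan'), ('2006002', 'Mary Lee'), ('2006003', 'Peter Wong');
INSERT INTO club VALUES
  ('2006001', 'eng'), ('2006001', 'chin'), ('2006002', 'eng');
""")

# Step 32: students whose ID never appears in club.
no_club = conn.execute(
    "SELECT studID, name FROM info "
    "WHERE studID NOT IN (SELECT studID FROM club)"
).fetchall()

# Step 33: members of 'eng' who are also members of 'chin'.
both = conn.execute(
    "SELECT studID FROM club WHERE clubname = 'eng' AND studID IN "
    "(SELECT studID FROM club WHERE clubname = 'chin')"
).fetchall()

# Step 34: members of either club; UNION removes the duplicate ID.
either = sorted(conn.execute(
    "SELECT studID FROM club WHERE clubname = 'eng' "
    "UNION "
    "SELECT studID FROM club WHERE clubname = 'chin'"
).fetchall())

print(no_club, both, either)
```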
Database Class Practice Activities 02
http://www.yll.edu.hk/~yll-cym/ca/download/database_activity_02.mdb
There are 5 tables in a database, the structures are shown below:
CLUB
Field      Data type   Width   Description
studID     character   20      ID of the students
clubname   character   20      The name of the club
fee        character   1       If the fee is settled, ‘Y’, ‘N’ otherwise
position   character   20      The position of the students in that club

Info
Field      Data type   Width   Description
sex        character   1       Sex of the students
name       character   20      Name of the students
address    character   100     District, e.g. “Yuen Long”, “Tin Shui Wai”
class      character   3       The class, e.g. 1A, 2A
classno    character   2       The class number
studID     character   7       ID of the students

Result
Field      Data type   Width   Description
subj       character   20      The name of the subject
mark       integer     /       The mark of the student for that subject
studID     character   7       The ID of the student

Teacher
Field      Data type   Width   Description
teaID      character   3       The teacher ID, it is the primary key of the table.
teaName    Var char    20      The name of the teacher
dob        Date        /       Date of birth of the teacher
doa        Date        /       Date of admission to the school

subjTeacher
Field      Data type   Width   Description
class      character   3       The name of the class
subj       Var char    20      The name of the subject
teaID      character   3       The teacher ID
lesson     integer     /       The number of lessons in a cycle for that subj in that class.
*It is assumed that a teacher may teach more than one subject for a particular class, and that a
lesson can be taught by more than one teacher.
Step   Description of the requirement / Corresponding SQL
1.
State the SQL statement required to create the table ‘teacher’. You should explicitly state that the
values in the field teaID must be unique and not null.
CREATE TABLE teacher (
teaID char(3) UNIQUE NOT NULL,
teaName varchar(20),
dob date,
doa date,
PRIMARY KEY (teaID)
)
2.
Add the following data to the table ‘teacher’.
Rec No.   teaID   teaName      dob          doa
1         006     Cheng Yung   1981/5/12    2004/9/1
2         003     Ma Ting      1978/2/5     2003/9/1
3         016     Law Man      1973/1/29    1999/12/5
4         012     Kon Li       1965/12/16   1994/9/1
Write the SQL statement required to add the data rec no = 3 to the table teacher.
INSERT INTO teacher ( teaID, teaName, dob, doa )
VALUES ('016', 'Law Man', #1/29/1973#, #12/5/1999#);
3.
In the table subjTeacher, can “class + teaID” be set as the Primary Key? Why?
Class + teaID should not be set as the primary key. Otherwise, a teacher could
teach a given class only one subject, which is not practical.
4.
What should be set as the primary key for the table subjTeacher?
class, subj, teaID
5.
Write down the SQL statement that is needed to create a table which has the primary key stated in
question 4.
CREATE TABLE subjTeacher (
class char(3) NOT NULL,
subj varchar(20) NOT NULL,
teaID char(3) NOT NULL,
lesson integer,
PRIMARY KEY (class, subj, teaID))
6.
Set the foreign key ‘teaID’ in the table ‘subjTeacher’ by referencing the table ‘Teacher’. What is the
advantage of using a foreign key?
ALTER TABLE subjTeacher ADD FOREIGN KEY (teaID)
REFERENCES teacher(teaID)
A foreign key is useful in ensuring data integrity or, more specifically,
referential integrity.
7.
Show a list of teachers and the total number of lessons taught by each
teacher.
SELECT teacher.teaName, SUM(subjTeacher.lesson) AS total_lesson
FROM teacher LEFT JOIN subjTeacher ON subjTeacher.teaID=teacher.teaID
GROUP BY teacher.teaName;
8.
Show a list of class, teacher and the number
of subjects taught by this teacher.
SELECT subjTeacher.class, teacher.teaName, COUNT(*) AS number_subject
FROM teacher INNER JOIN subjTeacher ON subjTeacher.teaID=teacher.teaID
GROUP BY subjTeacher.class, teacher.teaName
HAVING COUNT(*)>1;
9.
Show the name of English teachers.
SELECT DISTINCT teacher.teaName
FROM teacher INNER JOIN subjTeacher ON subjTeacher.teaID=teacher.teaID
WHERE subjTeacher.subj='eng';
10.
Show the names of the teachers who are younger than 30.
SELECT teaName, ROUND((date()-dob)/365,1) AS age
FROM teacher
WHERE (date()-dob)/365<30;
11.
Show the list of teachers who have not yet been assigned any lessons.
SELECT teaName
FROM teacher
WHERE teaID NOT IN (SELECT DISTINCT teaID FROM subjTeacher)
12.
a)
Create a view named “view1” which will include the class, classno, name and mark for the
subject ‘Chinese’ for 1A students.
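CREATE VIEW view1 AS SELECT info.class, info.classno, info.name, result.mark FROM info, result
WHERE info.studID = result.studID AND info.class = '1A' AND result.subj = 'Chinese'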
12.
b)
Create a view called “view2” such that it will hold the information from the table info about the
students from the class ‘1A’. Note what happens if the data in the view is modified, and give the
reason why this approach is used.
The data in the source table is modified too. This approach is used because,
when a view is opened to particular users who are permitted to make
modifications, the modifications should apply not only to the view but also to
the original source table.
13.
Create an index named “index1” on the columns class and classno of the table info.
CREATE INDEX index1 ON info (class, classno)
14.
Add a constraint named “cons1” to the table info such that the class and classno would be unique.
ALTER TABLE info ADD CONSTRAINT cons1
UNIQUE (class, classno)
15.
Add a constraint named “cons2” to the table result such that the mark has to be less than 100.
ALTER TABLE result ADD CONSTRAINT cons2
CHECK (mark < 100)
16.
Add a constraint named “cons3” to the table info such that the first 2 characters of the studid
must be “20”.
ALTER TABLE info ADD CONSTRAINT cons3
CHECK (LEFT(studid,2) = '20')
17.
Create a new table called 1A_result. Then, insert those data from the table result about the 1A
students.
We should use two-step query:
CREATE TABLE 1A_result (subj char(20), mark int, studID char(10))
INSERT INTO 1A_result
SELECT * FROM result WHERE studID IN
(SELECT studID FROM info WHERE class = '1A');
OR in Oracle, we can use this statement:
CREATE TABLE table1 AS
SELECT * FROM result WHERE studID IN (SELECT studID FROM info WHERE class='1A')
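The referential integrity mentioned in step 6 can be observed directly. The sketch below assumes SQLite via Python's sqlite3 module, where foreign keys are enforced only after PRAGMA foreign_keys = ON and the constraint is declared inside CREATE TABLE rather than added with ALTER TABLE. The lesson counts and the teacher ID '999' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite ignores foreign keys without this
conn.executescript("""
CREATE TABLE teacher (teaID CHAR(3) PRIMARY KEY, teaName VARCHAR(20));
CREATE TABLE subjTeacher (
  class  CHAR(3)     NOT NULL,
  subj   VARCHAR(20) NOT NULL,
  teaID  CHAR(3)     NOT NULL,
  lesson INTEGER,
  PRIMARY KEY (class, subj, teaID),
  FOREIGN KEY (teaID) REFERENCES teacher(teaID)
);
INSERT INTO teacher VALUES ('016', 'Law Man');
INSERT INTO subjTeacher VALUES ('1A', 'eng', '016', 6);
""")

# Teacher '999' does not exist, so the database should refuse this row.
try:
    conn.execute("INSERT INTO subjTeacher VALUES ('1A', 'chin', '999', 5)")
    rejected = False
except sqlite3.IntegrityError as err:
    rejected = True
    print("rejected:", err)

row_count = conn.execute("SELECT COUNT(*) FROM subjTeacher").fetchone()[0]
print(rejected, row_count)
```

Without the foreign key, the second insert would succeed and leave a lesson assigned to a teacher who is not in the teacher table.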
More on Join
A cross join is a special case of a join. It pairs rows in the same way as an inner join, but it has
no join condition (no WHERE or ON clause), so the result is the Cartesian product of the tables
being joined. A cross join query could look like this:
SELECT * FROM Actor, Movie
ActorID   Actor.MovieID   Movie.MovieID   Name            Title                 Year
1         22              21              Tom Hanks       A Beautiful Mind      2002
1         22              22              Tom Hanks       Forrest Gump          1994
1         22              23              Tom Hanks       The English Patient   1999
2         21              21              Russell Crowe   A Beautiful Mind      2002
2         21              22              Russell Crowe   Forrest Gump          1994
2         21              23              Russell Crowe   The English Patient   1999
3         23              21              Ralph Fiennes   A Beautiful Mind      2002
3         23              22              Ralph Fiennes   Forrest Gump          1994
3         23              23              Ralph Fiennes   The English Patient   1999
So, what would happen if the SQL is changed into
SELECT * FROM Actor, Movie WHERE actor.movieID = movie.movieID
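The question can be explored with a small sketch that runs both statements on the Actor and Movie rows shown in the table above. It assumes SQLite via Python's sqlite3 module; the second query selects only the name and title columns to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Actor (ActorID INTEGER, MovieID INTEGER, Name TEXT);
CREATE TABLE Movie (MovieID INTEGER, Title TEXT, Year INTEGER);
INSERT INTO Actor VALUES
  (1, 22, 'Tom Hanks'), (2, 21, 'Russell Crowe'), (3, 23, 'Ralph Fiennes');
INSERT INTO Movie VALUES
  (21, 'A Beautiful Mind', 2002), (22, 'Forrest Gump', 1994),
  (23, 'The English Patient', 1999);
""")

# No condition: every actor is paired with every movie (3 x 3 rows).
cross = conn.execute("SELECT * FROM Actor, Movie").fetchall()

# With the equality condition, only the pairs with matching MovieID remain.
matched = conn.execute(
    "SELECT Actor.Name, Movie.Title FROM Actor, Movie "
    "WHERE Actor.MovieID = Movie.MovieID ORDER BY Actor.ActorID"
).fetchall()

print(len(cross))
for row in matched:
    print(row)
```

Adding the WHERE condition turns the Cartesian product into an equijoin, so each actor is listed once, next to the movie that his MovieID refers to.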
SQL Exercise 01:
1
The staff information of a company is stored in a table with the following structure:
STAFF
Field Name   Type        Width   Dec   Description
ID           Character   5             ID of a staff, it is the primary key and will never be null
name         Character   20            name of a staff
salary       numeric     9       2     salary of the staff
dob          date        /             Date of birth
State the SQL needed to create this table.
CREATE TABLE staff
(
id char(5) UNIQUE NOT NULL,
name char(20),
salary numeric(9,2),
dob date,
PRIMARY KEY (id)
)
2
The staff information of a company is stored in a table with the following structure:
Branch
Field Name   Type        Width   Description
staffID      Character   5       ID of a staff, it will not be null
branchID     Character   3       Branch ID, it should be a foreign key to another table called “branch”, it will not be null
State the SQL needed to create this table.
CREATE TABLE branch
(
staffID char(5) NOT NULL,
branchID char(3) NOT NULL,
FOREIGN KEY (branchID) REFERENCES branch
)
3
The staff information of a company is stored in a table with the following structure:
Branch
Field Name   Type        Width   Description
staffID      Character   5       ID of a staff, it will not be null
branchID     Character   3       Branch ID, it should be a foreign key to another table called “branch”, it will not be null
State the SQL needed to set staffID and branchID together as the primary key of the table.
ALTER TABLE branch
ADD PRIMARY KEY (branchID, staffID)
4
The staff information of a company is stored in a table with the following structure:
Branch
Field Name   Type        Width   Description
staffID      Character   5       ID of a staff, it will not be null
branchID     Character   3       Branch ID, it should be a foreign key to another table called “branch”, it will not be null
State the SQL needed to add a constraint called “staff_length” to check whether the length of
staffID is 5.
ALTER TABLE branch
ADD CONSTRAINT staff_length
CHECK (len(staffID)=5)
5
Insert records with the following data to a table called “Member”
id     name    class   sex   dob
0013   Peter   1A      M     05/14/92
INSERT INTO member (id, name, class, sex, dob)
VALUES ('0013', 'Peter', '1A', 'M', {05/14/92})
6
In the table “result”, add 5 marks to each record in the field “eng”.
UPDATE result SET eng = eng + 5
7
Delete the records of the table “student” with the field “class” = 2B
DELETE FROM student WHERE class = '2B'
8
Drop a table “result” in the database “DB1”
DROP TABLE result
9
Drop the field “mark” in the table “student”
ALTER TABLE student DROP mark
<End of SQL Exercise 01>
SQL Exercise 02:
Consider the following database file student.dbf to store the information of students:
STUDENT
field       type        width   contents
id          numeric     4       student id number
name        character   10      name
dob         date        8       date of birth
sex         character   1       sex: M / F
class       character   2       class
hcode       character   1       house code: R, Y, B, G
dcode       character   3       district code
remission   logical     1       fee remission
mtest       numeric     2       Math test score
1.
a)
List all the 2A students
SELECT * FROM student WHERE class="2A"
b)
List the names and Math test scores of the 1B boys.
SELECT name, mtest FROM student WHERE class="1B" AND sex="M"
c)
List all the 2B boys who were born on Monday.
SELECT name FROM student WHERE class="2B" AND sex="M" AND DOW(dob)=2
[In access, DOW is not supported and it is out of syllabus.]
2.
a)
List the classes, names of students whose names contain the letter "e" as the third letter.
SELECT class, name FROM student WHERE name LIKE "__e%"
b)
List the classes, names of students whose names start with "T" and do not contain "y".
SELECT class, name FROM student WHERE name LIKE "T%" AND name NOT LIKE "%y%"
c)
List the names of 1A students whose Math test score is not 51, 61, 71, 81, or 91.
SELECT class, name, mtest FROM student WHERE class="1A" AND mtest NOT IN (51, 61,
71, 81, 91)
d)
List the students who were born between 22 March 86 and 21 April 86
SELECT class, name, dob FROM STUDENT WHERE dob BETWEEN {03/22/86} AND
{04/21/86}
3.
a)
Find the number of girls living in Tsim Sha Tsui (TST).
SELECT COUNT(*) FROM student WHERE sex="F" AND dcode="TST"
b)
List the number of passes in the Math test for each class. (passing mark = 50)
SELECT class, COUNT(*) FROM student WHERE mtest >= 50 GROUP BY class
c)
List the number of girls grouped by each class
SELECT class, COUNT(*) FROM student WHERE sex="F" GROUP BY class
d)
List the number of girls grouped by the year of birth.
SELECT YEAR(dob) as ydob, COUNT(*) FROM student WHERE sex="F" GROUP BY ydob
e)
Find the average age of Form 1 boys.
SELECT AVG((DATE( )-dob)/365) FROM student WHERE class LIKE "1_" AND sex="M"
4.
a)
List the students with fee remission, in the order of their classes and names.
SELECT class, name FROM student WHERE remission ORDER BY class, name
b)
The range of the Math test of a group of students is defined as:
Range = Maximum – Minimum.
List the range of the girls of each class.
SELECT class, MAX(mtest)-MIN(mtest) FROM student WHERE sex="F" GROUP BY class
c)
The controlled average (CAVG) of the Math test of a group of students is the average score from
which the highest and the lowest scores are excluded (ie. only n–2 out of n data are used).
List the CAVG of the Form 1 boys of each house.
SELECT hcode, (SUM(mtest)-MAX(mtest)-MIN(mtest)) / (COUNT(*)-2) FROM student
WHERE sex="M" AND class LIKE "1_" GROUP BY hcode
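The controlled average can be checked on a small data set. The following is a minimal sketch using Python's built-in sqlite3 module rather than the FoxPro environment these notes assume; the sample rows are hypothetical, and the result is multiplied by 1.0 so that SQLite does not perform integer division.

```python
import sqlite3

# Hypothetical sample rows: (name, sex, class, hcode, mtest)
rows = [
    ("Alan", "M", "1A", "R", 40),
    ("Ben",  "M", "1B", "R", 60),
    ("Carl", "M", "1A", "R", 80),
    ("Dave", "M", "1B", "R", 100),
    ("Eric", "M", "1A", "B", 50),
    ("Finn", "M", "1B", "B", 75),
    ("Gary", "M", "1A", "B", 90),
    ("Hugo", "M", "2A", "B", 10),   # Form 2 boy, excluded by class LIKE '1_'
    ("Iris", "F", "1A", "R", 99),   # girl, excluded by sex = 'M'
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (name TEXT, sex TEXT, class TEXT, hcode TEXT, mtest INTEGER)"
)
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", rows)

# Sum of all scores minus the highest and the lowest, divided by n - 2.
query = """
    SELECT hcode,
           (SUM(mtest) - MAX(mtest) - MIN(mtest)) * 1.0 / (COUNT(*) - 2) AS cavg
    FROM student
    WHERE sex = 'M' AND class LIKE '1_'
    GROUP BY hcode
    ORDER BY hcode
"""
cavg = dict(conn.execute(query).fetchall())
print(cavg)
```

For house R the scores are 40, 60, 80 and 100, so the two middle scores give (60 + 80) / 2 = 70. A house with only two Form 1 boys would make the divisor zero, so such groups need separate handling.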
5.
a)
Create a view with the name view1 that contains the names and dob of the students, ordered in
ascending order of dob.
CREATE VIEW view1 AS SELECT name, dob FROM student ORDER BY dob
b)
List the name, class and Math test score of the students whose score is at least 10 marks greater
than the average score of his / her class.
CREATE VIEW view1 AS SELECT class, AVG(mtest)+10 AS mark FROM student GROUP
BY class
SELECT s.name, s.class, s.mtest FROM student s, view1 v WHERE s.class = v.class AND
s.mtest >= v.mark
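The two-step approach (a view holding each class average plus 10, then a join against it) can be tried as follows. This is a sketch in Python's sqlite3 with hypothetical sample data; the view name class_target is an assumption chosen to avoid clashing with view1 from part (a), and the comparison uses >= because the question says "at least 10 marks greater".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, class TEXT, mtest INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("Amy", "1A", 50), ("Bob", "1A", 60), ("Cat", "1A", 85),
     ("Dan", "1B", 40), ("Eve", "1B", 50), ("Fay", "1B", 60)],
)

# Step 1: one row per class holding (class average + 10).
conn.execute("""
    CREATE VIEW class_target AS
    SELECT class, AVG(mtest) + 10 AS mark FROM student GROUP BY class
""")

# Step 2: join each student to the target of his / her own class.
high = conn.execute("""
    SELECT s.name, s.class, s.mtest
    FROM student s, class_target v
    WHERE s.class = v.class AND s.mtest >= v.mark
    ORDER BY s.name
""").fetchall()
print(high)
```

Class 1A averages 65, so the target is 75 and only Cat qualifies. Class 1B averages 50, so Fay with exactly 60 qualifies only because >= is used.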
The files phy.dbf, chem.dbf and bio.dbf are respectively the data files of the Physics Club, Chemistry Club and
Biology Club.
PHY / CHEM / BIO
field   type        width   contents
id      numeric     4       student id number
name    character   10      name
sex     character   1       sex: M / F
class   character   2       class
6.
a)
List the students who are common members of the Physics Club and the Chemistry Club.
SELECT * FROM phy WHERE id IN (SELECT id FROM chem)
b)
List the students who are common members of the Chemistry Club and Biology Club but not of
the Physics Club.
SELECT * FROM chem WHERE id IN (SELECT id FROM bio) AND id NOT IN (SELECT id
FROM phy)
Consider the following swim.dbf, which contains the information of Form 1 students participating in the
Swimming Gala. [and also student.dbf]
SWIM
field   type        width   contents
id      numeric     4       student id number
event   character   20      event
7.
a)
Print a list of 1A students taking part in the Swimming Gala, ordered by their names. The list
should also contain the events.
SELECT s.name, w.event FROM student s, swim w WHERE s.id=w.id AND s.class="1A"
ORDER BY 1 TO PRINTER
b)
List the Blue House members taking part in Free Style events.
SELECT DISTINCT s.class, s.name FROM student s, swim w WHERE s.id=w.id AND
s.hcode="B" AND w.event LIKE "%Free Style"
c)
List the number of students taking part in each event.
SELECT w.event, COUNT(*) FROM student s, swim w WHERE s.id=w.id GROUP BY w.event
d)
Print a complete list of the Swimming Gala. The list should also show the students not taking part
in any event with "******". The list should be ordered by class and student name.
SELECT s.class, s.name, w.event FROM student s, swim w WHERE s.id=w.id UNION
SELECT class, name, "******" FROM student WHERE id NOT IN (SELECT id FROM swim)
AND class LIKE "1_" ORDER BY 1, 2
e)
List the students taking part in two or more events. [Self-join]
SELECT DISTINCT s.class, s.name FROM student s, swim w1, swim w2 WHERE s.id=w1.id
AND s.id=w2.id AND w1.event <> w2.event
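The self-join pairs every swim record of a student with every other swim record of the same student, so a student appears only if two different events exist. A minimal sketch in Python's sqlite3 with hypothetical data shows that it gives the same result as the alternative GROUP BY ... HAVING COUNT(*) >= 2 formulation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, name TEXT, class TEXT)")
conn.execute("CREATE TABLE swim (id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Amy", "1A"), (2, "Bob", "1A"), (3, "Cat", "1B")],
)
conn.executemany(
    "INSERT INTO swim VALUES (?, ?)",
    [(1, "50m Free Style"), (1, "50m Back Stroke"), (1, "100m Free Style"),
     (2, "50m Free Style")],
)

# Self-join: swim is used twice (w1, w2) and must supply two different events.
self_join = conn.execute("""
    SELECT DISTINCT s.class, s.name
    FROM student s, swim w1, swim w2
    WHERE s.id = w1.id AND s.id = w2.id AND w1.event <> w2.event
""").fetchall()

# Alternative: count the swim records of each student.
grouped = conn.execute("""
    SELECT s.class, s.name
    FROM student s, swim w
    WHERE s.id = w.id
    GROUP BY s.id, s.class, s.name
    HAVING COUNT(*) >= 2
""").fetchall()
print(self_join, grouped)
```

DISTINCT is needed in the self-join because a student with three events matches six (w1, w2) pairs.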
f)
List the boys of each House taking part in the Swimming Gala but not taking part in 50m Back
Stroke, ordered by House and student name.
SELECT hcode, name FROM student WHERE sex="M" AND id IN (SELECT id FROM swim)
AND id NOT IN (SELECT id FROM swim WHERE event = "50m Back Stroke") ORDER BY 1,
2
<End of SQL Exercise 02>
SQL Exercise 03
1
There are two database tables, “book” and “borrow_record” for the library. Their structures are
shown below:
Book
Field Name   Type        Width   Dec   Description
bookID       Character   10            ID of the book
title        varchar     50            The title of the book
abstract     memo        /             The abstract of the book
borrow_record
Field Name   Type        Width   Dec   Description
bookID       Character   10            ID of the book
dob          Date        /             Date of borrow
userID       character   10            ID of the borrower
Answer the following questions:
a)
Why does the field “title” in the table book use the data type varchar? Should we change the field
bookID to the data type varchar as well?
The data type varchar stores the contents of a field with variable length, so it
saves storage space. In this example, the length of the field “title” varies
up to a ceiling of 50 characters. Varchar therefore seems much more flexible than
the data type char, whose length is fixed. However, char is more efficient
if the field is short (because varchar carries an overhead of 2 bytes) or if all values
share a common length (e.g. sex, phone number, etc.), so we should not change the data type
of the field bookID.
b)
Why does the field “abstract” in the table book use the data type memo?
The data type memo is not only variable in length but also unlimited in length (whereas
varchar has a limit, say, 255 characters in Access). So, memo is a suitable data type
for abstract.
c)
Now, you want to set the bookID in the table “borrow_record” as a foreign key referencing
the table book.
(i)
State the SQL statement needed.
ALTER TABLE borrow_record ADD FOREIGN KEY (bookID) REFERENCES book
(ii)
Under what conditions can we not set the foreign key to another table?
The foreign key has to map to the primary key of the referenced table. So, if the
referenced table (in this case, “book”) has no primary key set, the SQL statement cannot be
executed successfully.
d)
Now, you want to set studentID as a foreign key to another table. Can we create two foreign keys
(bookID and studentID) to two different database tables?
Yes, we can form two different foreign keys in a table.
2
There is a database table “Enrollment”, the structure is shown below:
Enrollment
Field Name   Type        Width   Description
courseID     character   5       ID of the course
studentID    character   5       ID of the student
group        Integer     /       The number of the group the student is attending
a)
Now, you want to set the fields courseID and studentID as the composite primary key of this table.
State the SQL needed (command: ALTER TABLE).
ALTER TABLE enrollment ADD PRIMARY KEY (courseID, studentID)
b)
Under what condition can we not set the primary key?
If the table already contains data and some records share the same combination of
courseID and studentID, the combination is not unique, which a primary key requires,
so that SQL statement cannot set it as the primary key.
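The uniqueness requirement can be seen in a small experiment. This is only a sketch under stated assumptions: SQLite does not support ALTER TABLE ... ADD PRIMARY KEY, so a unique index on the same pair of fields stands in for the key, the sample rows are hypothetical, and the field "group" is renamed grp because GROUP is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (courseID TEXT, studentID TEXT, grp INTEGER)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [("C0001", "S0001", 1),
     ("C0001", "S0002", 1),
     ("C0001", "S0001", 2)],   # same (courseID, studentID) pair as the first row
)

# Find the combinations that would violate the key.
dups = conn.execute("""
    SELECT courseID, studentID, COUNT(*)
    FROM enrollment
    GROUP BY courseID, studentID
    HAVING COUNT(*) > 1
""").fetchall()

# Stand-in for ADD PRIMARY KEY: a unique index on the same fields.
try:
    conn.execute("CREATE UNIQUE INDEX enrollment_key ON enrollment (courseID, studentID)")
    key_created = True
except sqlite3.DatabaseError:
    key_created = False   # rejected because the existing data is not unique

print(dups, key_created)
```

Running the duplicate check first shows which records must be cleaned up before the key can be set.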
<End of SQL Exercise 03>
SQL Exercise 04
1.
With reference to the records in the table info:
ITEM_NO   DESC        QTY   SAL_PRICE   CATEGORY
003       Bugle       5     1500        wind
004       Piano       3     13000       percussion
005       Violin      2     6500        strings
001       Trumpet     3     2500        wind
006       Saxophone   4     4000        wind
002       Tuba        4     4000        wind
007       Drum        3     18000       percussion
i.
Create a new field called “amount” which is a numeric data with width=10 and 2 decimal places.
Then Update the field amount by multiplying the quantity (QTY) and the sale prices (SAL_PRICE).
ALTER TABLE info ADD amount numeric(10,2)
UPDATE info SET amount = qty*sal_price
ii.
Show a list of musical instruments in descending order of their prices.
SELECT desc, sal_price FROM info ORDER BY 2 DESC
iii.
Show a list of ITEM_NO whose SAL_PRICE is neither the highest nor the lowest.
SELECT item_no FROM info WHERE sal_price <> (SELECT MAX(sal_price) FROM info)
AND sal_price <> (SELECT MIN(sal_price) FROM info)
iv.
What would be the output if the following SQL statement is executed:
SELECT desc FROM info WHERE desc NOT LIKE “%e” AND desc LIKE “%e%”
Trumpet
v.
Create a list that shows the categories of the musical instruments of which the total number of
quantity of that category is more than 10.
SELECT category, SUM(qty) AS cnt FROM info GROUP BY category HAVING SUM(qty) > 10
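The answers to parts iv and v can be verified by loading the seven records of the table info. The sketch below uses Python's sqlite3; the only departure from the exercise is that the field DESC is called descr here, because DESC is a reserved word in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "descr" replaces the field name DESC, which SQLite reserves for ORDER BY ... DESC.
conn.execute(
    "CREATE TABLE info (item_no TEXT, descr TEXT, qty INTEGER, sal_price INTEGER, category TEXT)"
)
conn.executemany("INSERT INTO info VALUES (?, ?, ?, ?, ?)", [
    ("003", "Bugle",     5, 1500,  "wind"),
    ("004", "Piano",     3, 13000, "percussion"),
    ("005", "Violin",    2, 6500,  "strings"),
    ("001", "Trumpet",   3, 2500,  "wind"),
    ("006", "Saxophone", 4, 4000,  "wind"),
    ("002", "Tuba",      4, 4000,  "wind"),
    ("007", "Drum",      3, 18000, "percussion"),
])

# Part iv: contains the letter "e" somewhere, but does not end with "e".
part_iv = [row[0] for row in conn.execute(
    "SELECT descr FROM info WHERE descr NOT LIKE '%e' AND descr LIKE '%e%'"
)]

# Part v: categories whose total quantity exceeds 10.
part_v = conn.execute("""
    SELECT category, SUM(qty) AS cnt
    FROM info
    GROUP BY category
    HAVING SUM(qty) > 10
""").fetchall()
print(part_iv, part_v)
```

Bugle and Saxophone contain "e" but end with it, so only Trumpet remains. The wind category totals 5 + 3 + 4 + 4 = 16, while percussion (6) and strings (2) are filtered out by HAVING.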
<End of SQL Exercise 04>
SQL Exercise 05
A library stores the information about the books in the following table:
BOOK.DBF
Field Name   Type        Width   Decimal
Book_id      Character   4
Title        Character   40
Type         Character   40
Date_pur     Date        8
Author       Character   40
ISBN         Character   20
1.
Create a list that shows the book title, type and author of which the book ID range from 2100 to 2160.
____________________________________________________________________
____________________________________________________________________
2.
Display the book titles which contain the words ‘Plants’ or ‘Tree’. The book titles may be in upper or
lower case.
____________________________________________________________________
SELECT title FROM book WHERE UPPER(title) LIKE "%PLANTS%" OR UPPER(title) LIKE "%TREE%"
____________________________________________________________________
____________________________________________________________________
3.
Display a list of book type, date of purchase and title by ordering the records by their Type. Within each
type, arrange the records in ascending order of Date_Pur.
____________________________________________________________________
____________________________________________________________________
4.
Count the number of books that were purchased on or before 01/01/1995.
____________________________________________________________________
____________________________________________________________________
5.
Show the title of the book of which the ISBN equals “0333469267”.
____________________________________________________________________
____________________________________________________________________
6.
Count the number of books by each book type.
____________________________________________________________________
____________________________________________________________________
7.
Find out the title of the book that has the shortest name.
____________________________________________________________________
____________________________________________________________________
____________________________________________________________________
8.
Modify the table to include a character field with the width=20, NewBook_ID, which stores the new book
ID of the book and its record should be unique. Update this new book ID according to the following table.
Book Type            New Book ID
Physics              “P” + Old Book ID
Chemistry            “C” + Old Book ID
Integrated Science   “I” + Old Book ID
____________________________________________________________________
____________________________________________________________________
____________________________________________________________________
____________________________________________________________________
____________________________________________________________________
9. Delete the records of those books which were purchased more than 50 years ago. (Note: You need to
compare the year of purchase with that of the system date and you may assume a year to have 365.25
days)
____________________________________________________________________
____________________________________________________________________
<End of SQL Exercise 05>
Past Paper Investigation
2004 – AS – CA #1
1.
A table is created with the following SQL command to store the subject scores of Chemistry (CHEM), Biology
(BIO) and Computer Studies (CS) of a class of students. REG_NO and EN_NAME represent the registration
number and name of a student respectively.
CREATE TABLE 4D (
REG_NO CHAR(6),
EN_NAME CHAR(30),
CHEM INTEGER,
BIO INTEGER,
CS INTEGER )
(a)
Modify the above SQL command so that two records with the same registration number cannot be input
into 4D.
(1 mark)
Reg_no CHAR(6) UNIQUE
OR
Reg_no CHAR(6) PRIMARY KEY
(b)
Write an SQL command to input the following record into 4D.
Registration number: S03108
Name: Fok Chi Yuen
Chemistry: 73
Computer Studies: 88
(1 mark)
INSERT INTO 4D (reg_no, en_name, chem, cs) VALUES ('S03108', 'Fok Chi Yuen', 73, 88)
(c)
Because of a modification in the examination paper of Chemistry, all students will be awarded two
additional scores. Write an SQL command to increase the value of CHEM by 2 in each record.
(1 mark)
UPDATE 4D SET chem = chem + 2
(d)
Describe the purpose of the following SQL command:
DELETE FROM 4D WHERE LEN(TRIM(EN_NAME)) = 0
(1 mark)
Remove all records whose field en_name is an empty string or contains only spaces
2004 – AS – CA #7
7.
Below are two database files, DB1 and DB2, where the first row indicates the field names.
DB1
subject   staff_code
Chinese   01
English   03
Maths     05
DB2
Staff_code   Staff_name
01           May Au
02           Billy Ho
(a)
Apply equi-join on DB1 and DB2, and write the result in the space provided.
(2 marks)
Answer:
subject   Staff_code (DB1)   Staff_code (DB2)   Staff_name
Chinese   01                 01                 May Au
(b)
Apply full outer join on DB1 and DB2, and write the result in the space provided.
(2 marks)
Answer:
subject   Staff_code (DB1)   Staff_code (DB2)   Staff_name
Chinese   01                 01                 May Au
.NULL.    .NULL.             02                 Billy Ho
English   03                 .NULL.             .NULL.
Maths     05                 .NULL.             .NULL.
The records may appear in any order (the order of records is not important),
but the contents must be exactly correct.
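Both joins can be reproduced with the two small tables. The sketch below uses Python's sqlite3; because older SQLite versions lack FULL OUTER JOIN, the full outer join is emulated here (an assumption of this example, not the method of the past paper) as a LEFT JOIN plus the DB2 rows that have no partner in DB1. SQLite shows missing values as NULL (None in Python) rather than .NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db1 (subject TEXT, staff_code TEXT)")
conn.execute("CREATE TABLE db2 (staff_code TEXT, staff_name TEXT)")
conn.executemany("INSERT INTO db1 VALUES (?, ?)",
                 [("Chinese", "01"), ("English", "03"), ("Maths", "05")])
conn.executemany("INSERT INTO db2 VALUES (?, ?)",
                 [("01", "May Au"), ("02", "Billy Ho")])

# Equi-join: keep only the pairs whose staff codes are equal.
equi = conn.execute("""
    SELECT a.subject, a.staff_code, b.staff_code, b.staff_name
    FROM db1 a, db2 b
    WHERE a.staff_code = b.staff_code
""").fetchall()

# Full outer join, emulated: all DB1 rows (matched or not),
# plus the DB2 rows whose staff code never appears in DB1.
full_outer = conn.execute("""
    SELECT a.subject, a.staff_code, b.staff_code, b.staff_name
    FROM db1 a LEFT JOIN db2 b ON a.staff_code = b.staff_code
    UNION ALL
    SELECT NULL, NULL, b.staff_code, b.staff_name
    FROM db2 b
    WHERE b.staff_code NOT IN (SELECT staff_code FROM db1)
""").fetchall()
print(equi)
print(full_outer)
```

The equi-join returns one record and the full outer join returns four, matching the model answers.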
Revision Exercise 01
1.
How many tables will be needed to present the following relationships in 3rd normal form?
(i)
[Diagram: A and B, cardinality 1 : 1]
The number of tables needed is :
The foreign key is in table
(ii)
[Diagram: A and B, cardinality 1 : M]
The number of tables needed is :
The foreign key is in table
(iii)
[Diagram: A and B, cardinality 1 : 1]
The number of tables needed is :
The foreign key is in table
(iv)
[Diagram: A and B, cardinality N : M]
The number of tables needed is :
The foreign key is in table
(v)
[Diagram: cardinality 1 : 1, entity labels not recoverable]
The number of tables needed is :
The foreign key is in table
2.
For the following database schema,
employeeDepartment(employeeID, name, job, departmentID, departmentName)
What kinds of anomaly does it suffer from? Briefly explain the scenarios of the anomalies.
It suffers from 3 kinds of anomaly: insertion anomaly, deletion anomaly and
modification anomaly. For employees in the same department, the name of the
department is repeated several times, so there is obvious redundancy.
Most importantly, if the name of the department with departmentID = 123 changes
from “Personnel” to “Public Relationship”, the change must be made in the record of every
employee in this department, so it suffers from modification anomaly. Also, you
cannot insert a new department if no staff has been assigned to that department,
so it also suffers from insertion anomaly. Finally, when you delete
the last employee in a department, you delete not only the employee but also the
whole department, so it suffers from deletion anomaly as well.
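The anomalies described above can be observed directly. The following is a minimal sketch in Python's sqlite3 with hypothetical employees; the split into employee and department tables is the usual normalised design and is an assumption of this example, since the question itself does not give it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE employeeDepartment
    (employeeID TEXT, name TEXT, job TEXT, departmentID TEXT, departmentName TEXT)""")
conn.executemany("INSERT INTO employeeDepartment VALUES (?, ?, ?, ?, ?)", [
    ("E1", "Ann", "clerk",   "123", "Personnel"),
    ("E2", "Bob", "officer", "123", "Personnel"),
    ("E3", "Cal", "clerk",   "456", "Sales"),
])

# Modification anomaly: renaming one department touches one row per employee.
cur = conn.execute("""UPDATE employeeDepartment
    SET departmentName = 'Public Relationship' WHERE departmentID = '123'""")
rows_changed_flat = cur.rowcount

# Deletion anomaly: removing the last employee of 456 removes the department too.
conn.execute("DELETE FROM employeeDepartment WHERE employeeID = 'E3'")
dept_456_left = conn.execute(
    "SELECT COUNT(*) FROM employeeDepartment WHERE departmentID = '456'"
).fetchone()[0]

# Normalised design (assumed): department data is stored once.
conn.execute("CREATE TABLE department (departmentID TEXT PRIMARY KEY, departmentName TEXT)")
conn.execute("""CREATE TABLE employee
    (employeeID TEXT PRIMARY KEY, name TEXT, job TEXT,
     departmentID TEXT REFERENCES department (departmentID))""")
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("123", "Personnel"), ("456", "Sales"),
    ("789", "Research"),   # a department without staff can now be inserted
])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("E1", "Ann", "clerk", "123"), ("E2", "Bob", "officer", "123"),
])
cur = conn.execute("""UPDATE department
    SET departmentName = 'Public Relationship' WHERE departmentID = '123'""")
rows_changed_normalised = cur.rowcount

print(rows_changed_flat, dept_456_left, rows_changed_normalised)
```

In the single-table design the rename changes two rows and department 456 vanishes with its last employee. After the split, the rename changes exactly one row.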
3.
For the following database schema, what kinds of anomaly does it suffer from?
Subject (SubjID, SubjName, TeacherInChargeName)
Teacher(TeacherID, TeacherName, TeacherSex, TeacherDoB, TeacherAdmissionDate)
If a subject chairperson “Mr. Wong” quits the job and “Ms. Cheung” takes the post,
then we have to change this piece of information more than once. Therefore,
it suffers from modification anomaly.
4.
For a database schema
Marriage(Male, Female)
What kinds of anomaly does it suffer?
It suffers from insertion anomaly and deletion anomaly. You cannot
insert a female unless she is married to a man, so it suffers from insertion anomaly.
Also, if you delete a particular man, you also delete the information about the female
married to him, so it suffers from deletion anomaly.
5.
(i)
Why do anomalies happen?
Because of data dependency and data redundancy
(ii) How to avoid anomalies?
Normalization.
6.
Transform the following M:N relationship into multiple M:1 relationships.
[Diagram: Student (M) - Take - (N) Course]
[Diagram: Student (1) - Make - (M) Application (N) - Refer - (1) Course]
<End of Revision Exercise 01>
Revision Exercise 02
David is a database administrator in a famous bookstore chain. He designed a database table called
“book_info” with the following structure:
Field       Type        Width   Decimal   Description
bookcode    character   8       /         The code for the book
title       character   50      /         The title of the book
publisher   character   50      /         The publisher of the book
author      character   100     /         The author of the book
pub_date    Date        8       /         Date of publication
price       Numeric     6       2         The price of the book
discount    Numeric     3       2         The discount of the book
a)
Write the SQL statement that creates a database table with the structure shown above.
CREATE TABLE book_info (bookcode char(8), title char(50), publisher char(50),
author char(100), pub_date date, price numeric(6,2), discount numeric(3,2))
b)
Modify the database structure such that the discount will be set to 5% as default value.
ALTER TABLE book_info ALTER discount SET DEFAULT 0.05
c)
Modify the database structure such that the title of the book can never be null.
ALTER TABLE book_info ALTER title char(50) NOT NULL
d)
Modify the database structure such that the date of publication can never be as early as 1980/1/1. It will
give an error message to the invalid input.
ALTER TABLE book_info ADD CONSTRAINT cons1 CHECK
(pub_date > #1980/1/1#)
e)
Write down the SQL statement such that it will give the total number of books available in the bookstore
according to their publishers.
SELECT publisher, COUNT(*) AS cnt FROM book_info GROUP BY publisher
f)
David suggested that the database structure should be modified such that it can classify the books into
different categories. He suggested adding a field called category in the database table. Give your
opinions on his suggestions and can you give any suggestion?
If a new field is added to the table book_info, it will cause a problem when a book
can be classified into more than one category, i.e. it does not allow a book to have
two or more categories. To solve the problem, we should create new tables
according to the categories, e.g. tables called science, geography, etc., in which the
bookcode is stored.
<End of Revision Exercise 02>
Revision Exercise 03
David is a database administrator in a country club. He is responsible for designing a database that
lets the club members book services. He created a database file called “activities”, which
contains 3 tables called “member”, “facility” and “booking”.
a)
What are the differences between a database file and a database table?
The database file is used to contain the database tables. A database table is used to
store the data, whereas the database file does not store the data itself but the relations between the tables.
b)
Why does David need to create the database file? Which features can he not use if he creates
only the database tables without the database file?
Foreign key
David used the following SQL statements to create the 3 tables:
CREATE TABLE member (ID CHAR(9) PRIMARY KEY, name CHAR(25) NOT NULL, Vdate DATE, grade CHAR(10))
CREATE TABLE facility (Fcode CHAR(10) PRIMARY KEY, name CHAR(25), place CHAR(25), price INTEGER)
CREATE TABLE booking (Bdate DATE, Btime INTEGER, Fcode CHAR(10), MID CHAR(9))
c)
Roughly estimate the file size (in K bytes) of the table “member” if there are 10000 members in the
country club. Will the file size be less than, exactly equal to or greater than your estimation?
file size = 10000*(9+25+8+10)/1024 = 508 K byte
The file size should be a little bit greater than the estimated result, it is because it
will have an extra index file for the primary key “ID”.
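The estimate is simple arithmetic on the field widths. A short sketch in Python is given below; the 8 bytes for the DATE field follow the estimate above, and the dictionary name field_widths is only illustrative.

```python
# Field widths in bytes, taken from CREATE TABLE member (...).
# DATE is counted as 8 bytes, as in the estimate above.
field_widths = {"ID": 9, "name": 25, "Vdate": 8, "grade": 10}

members = 10000
record_size = sum(field_widths.values())   # bytes per record
total_bytes = members * record_size        # bytes for all records
size_kb = total_bytes / 1024               # 1 K byte = 1024 bytes

print(record_size, total_bytes, round(size_kb))
```

Each record takes 52 bytes, so 10000 records take 520000 bytes, which is about 508 K bytes before any index or file header is added.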
Here is a brief description to the structure of the table “booking”:
Field   Description
Bdate   Booked date, i.e. it will record the date that the facility will be used by the member.
Btime   Booked time, i.e. it will record the time zone, ranging from 1 to 14. (1 represents 8:00 to 9:00, 2
        represents 9:00 to 10:00, and so on.)
Fcode   Facility code
Mid     Member id
d)
John (member id = 1011103) booked the tennis court (facility code = ten101) on April-20-2006 at the
time from 3:00 to 4:00 p.m. Write a SQL statement to insert this record to the table “booking”.
INSERT INTO booking (bdate, btime, fcode, mid) VALUES ({04/20/2006}, 8, “ten101”,
“1011103”)
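The value 8 for Btime comes from the slot numbering: slot 1 starts at 8:00, so the slot number is the starting hour (24-hour clock) minus 7. A small sketch in Python follows; the function name btime_slot is hypothetical.

```python
def btime_slot(start_hour_24: int) -> int:
    """Convert a starting hour (24-hour clock) to a Btime slot.

    Slot 1 is 8:00 to 9:00, slot 2 is 9:00 to 10:00, and so on up to slot 14.
    """
    slot = start_hour_24 - 7
    if not 1 <= slot <= 14:
        raise ValueError("booking time is outside the 14 defined slots")
    return slot


# John's booking: 3:00 to 4:00 p.m. starts at 15:00.
john_slot = btime_slot(15)
print(john_slot)
```

The last valid slot, 14, therefore starts at 21:00.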
e)
Since the field “Btime” in the table “booking” should only range from 1 to 14, how to use SQL statement
to modify the structure of the table “booking” such that it will avoid the invalid inputs.
ALTER TABLE booking ALTER btime SET CHECK (btime >= 1 AND btime <=14)
ERROR “Input out of range”
f)
Write a SQL statement such that it will produce a name list of members who have booked the facility
more than 5 times.
SELECT name FROM member WHERE id IN (SELECT mid FROM booking GROUP BY mid
HAVING COUNT(*) > 5)
g)
There are 3 different grades for memberships, the first one is “general”, the second one is “prestige” and
the third one is “VIP”. Now, the company wants to insert a field called “discount” in the table “member”
and so, give 50% off for VIP, 20% off for prestige and 10% off for general members.
(i)
Write a SQL statement such that it will insert a new field to the table “member”.
ALTER TABLE member ADD discount numeric(3,2)
(ii)
Write a SQL statement such that it will set the discount to 50% if he / she is a VIP.
UPDATE member SET discount = 0.5 WHERE grade = “VIP”
(iii) Write a SQL statement such that it will give the total number of booking being made by the
members according to their grades.
SELECT member.grade, COUNT(*) AS cnt FROM member, booking WHERE member.id
= booking.mid GROUP BY member.grade
h)
Write a SQL statement such that it will produce a name list of members who have spent more than $1000
in the year 2005.
SELECT member.name FROM member, facility, booking WHERE booking.mid =
member.id AND booking.fcode = facility.fcode AND YEAR(booking.bdate) = 2005
GROUP BY member.id, member.name HAVING sum(facility.price) > 1000
<End of Revision Exercise 03>
Revision Exercise 04
1.
Which of the following is / are DDL?
(1) CREATE TABLE
(2) ALTER
(3) INSERT INTO
A. (1) and (2) only
B. (1) and (3) only
C. (2) and (3) only
D. (1), (2) and (3)
2.
Which of the following is DML?
A. SELECT
B. CREATE
C. ALTER
D. DROP
Read the following statements and answer questions 3 to 7.
A doctor owns a clinic located in Causeway Bay. There are approximately 100 clients. The
doctor records the name, address and phone number of each client on a paper card. For each
interview, the doctor will retrieve the paper card of the client and record the date, symptoms,
diagnostic results and medications on the paper card. Assume that there is a set of well-known
symptoms used by doctors. An ER diagram is used to find out the overall structure of data during the
phase of conceptual data modeling. In the diagram, a relationship is set up between CLIENT
and INTERVIEW.
3.
The cardinalities on the side of CLIENT and INTERVIEW are ________ and ________ respectively.
A. Optional, optional
B. Optional, mandatory
C. Mandatory, optional
D. Mandatory, mandatory
4.
Which of the following can be included as an entity?
A. date of interview
B. paper card
C. symptoms
D. the clinic
5.
Which of the following should NOT be included as an attribute?
A. name of client
B. date of interview
C. diagnostic results
D. Causeway Bay
6.
The type of relationship in the direction from CLIENT to INTERVIEW is
A. one-to-one
B. one-to-many
C. many-to-one
D. many-to-many
7.
Which of the following should be an attribute included in INTERVIEW?
A. name of client
B. diagnostic result
C. name of doctor
D. number of visits by a client
8.
Which of the following types of attributes should be resolved?
A. key attribute
B. non-key attribute
C. single-valued attribute
D. multi-valued attribute
9.
After resolution, which of the following should disappear from an ER diagram?
A. 1:1 unary relationship
B. 1:1 binary relationship
C. 1:M binary relationship
D. N:M relationship
10.
A binary relationship involves
A. two relationships
B. two different entities
C. two attributes
D. more than two entities
Conventional Questions:
1.
A private tennis club has ten tennis courts for members to use. Bookings from members are
accepted within one week before the tennis court is used. In each booking, a member can reserve
at most 3 tennis courts and the duration is a multiple of half an hour. Two entities are identified:
MEMBER
A member of the tennis club. The identifier is MemberID. Other attributes are Name of member and
Contact phone number.
COURT
A tennis court to be used by members. The identifier is CourtID. Other attributes include
Location and Fee.
a)
Sketch an initial E-R diagram to show the relationship and cardinality between MEMBER and COURT.
[Diagram: MEMBER (attributes MemberID, Name) (M) - book - (N) COURT (attributes Location, Fee)]
b)
Redraw the E-R diagram to include an entity BOOKING with attributes including Date and StartTime
of use and Duration of booking.
2.
The staff of a company will have a number of skills, for example:
StaffID   StaffName      Skill
001       John Smith     Access, DB2, FoxPro
002       Dave Jones     dBase, Clipper
003       Mike Beach
004       Jerry Miller   DB2, Oracle
005       Ben Stuart     Oracle, Sybase
006       Fred Flint     Informix
007       Joe Blow
008       Greg Brown
009       Doug Hope      Access, MSSqlServer
a)
Is the following database table in 1st normal form? If not, how should the structure be modified to
achieve 1st normal form?
STAFF(StaffID, StaffName, Skill)
No, it is not in 1st normal form. It should be changed into
Staff (StaffID, StaffName)
Skill (StaffID, Skill)
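The change to 1st normal form amounts to giving every skill its own record. A minimal sketch in Python is shown below, using three of the staff rows from the table; the variable names are only illustrative.

```python
# Un-normalised rows: the Skill field holds a comma-separated list (multi-valued).
staff_unnormalised = [
    ("001", "John Smith", "Access, DB2, FoxPro"),
    ("002", "Dave Jones", "dBase, Clipper"),
    ("003", "Mike Beach", ""),               # no skill recorded
]

# Staff (StaffID, StaffName): one record per staff member.
staff = [(staff_id, name) for staff_id, name, _ in staff_unnormalised]

# Skill (StaffID, Skill): one record per single skill, so every field is atomic.
skill = [
    (staff_id, item.strip())
    for staff_id, _, skills in staff_unnormalised
    for item in skills.split(",")
    if item.strip()
]

print(staff)
print(skill)
```

A staff member without skills still appears in Staff but contributes no record to Skill.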
b)
If we treat Staff and Skill as two entities, then, construct the ER diagram for it.
c)
Given that every skill comes from a set of skills pre-defined by the company, how should the
structure of the database schema be changed to reduce redundancy?
Staff (StaffID, StaffName)
Skill_Info(Skill_ID, Skill)
StaffSkill (StaffID, Skill_ID)
Foreign keys are set in the table StaffSkill, referencing the tables Staff and Skill_Info, to
ensure data integrity.
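The three-table schema can be tried out as follows. This is a sketch in Python's sqlite3, where foreign key checking has to be switched on with PRAGMA foreign_keys; the numeric skill IDs and the composite primary key on StaffSkill are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE Staff (StaffID TEXT PRIMARY KEY, StaffName TEXT)")
conn.execute("CREATE TABLE Skill_Info (Skill_ID INTEGER PRIMARY KEY, Skill TEXT)")
conn.execute("""
    CREATE TABLE StaffSkill (
        StaffID  TEXT    REFERENCES Staff (StaffID),
        Skill_ID INTEGER REFERENCES Skill_Info (Skill_ID),
        PRIMARY KEY (StaffID, Skill_ID)
    )
""")

conn.execute("INSERT INTO Staff VALUES ('004', 'Jerry Miller')")
conn.executemany("INSERT INTO Skill_Info VALUES (?, ?)", [(1, "DB2"), (2, "Oracle")])
conn.executemany("INSERT INTO StaffSkill VALUES (?, ?)", [("004", 1), ("004", 2)])

# Skill 99 is not in the pre-defined set, so the foreign key rejects it.
try:
    conn.execute("INSERT INTO StaffSkill VALUES ('004', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

skills = [row[0] for row in conn.execute("""
    SELECT i.Skill
    FROM StaffSkill s, Skill_Info i
    WHERE s.Skill_ID = i.Skill_ID AND s.StaffID = '004'
    ORDER BY i.Skill
""")]
print(rejected, skills)
```

Each skill name is stored once in Skill_Info, and StaffSkill can only refer to skills that exist there.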
d)
How many tables will result from the following entity relationships?
(i) 1 to 1 (Mandatory : Mandatory)
(ii) 1 to 1 (Optional : Mandatory)
(iii) M to N (Mandatory : Optional)
(iv) M to 1 (Mandatory : Mandatory)
<End of Revision Exercise 04>
Revision Exercise 05 (SQL)
1.
A machinery company stores the parts information in a table with the following structure:
PARTS
Field Name   Type        Width   Description
Part_no      Integer             Unique code for a part
Descript     Character   20      Description of the part
Qty          Integer             Quantity of the part
supplier     Character   20      Supplier of the part
Write SQL statements to fulfill the following requests. Whenever the columns are not specified,
you may use SELECT * …
a)
Produce a list of parts in ascending order of quantity.
SELECT * FROM parts
ORDER BY qty ASC
b)
Produce a list of parts that consist of the keyword ‘Shaft’ in the description.
SELECT * FROM parts WHERE descript LIKE '%Shaft%'
c)
Produce a list of parts that have a quantity more than 20 and are supplied by ‘China Metals Co.’
SELECT * FROM parts WHERE qty > 20 AND supplier = ‘China Metals Co.’
d)
List all the suppliers without duplication.
SELECT DISTINCT supplier FROM parts
e)
Increase the quantity by 10 for those parts with quantity less than 10.
UPDATE parts SET qty = qty + 10
WHERE qty < 10
f)
Delete records with part_no equal to 879, 654, 231 and 234
DELETE FROM parts WHERE part_no IN (879, 654, 231, 234)
g)
Add a field “Date_purchase” to record the date of purchase.
ALTER TABLE parts ADD COLUMN date_purchase Date
h)
Make a copy of the table with only fields part_no and qty. Name the new table as PARTS2.
CREATE TABLE parts2 AS SELECT part_no, qty FROM parts
2.
A supermarket stores the payroll information in a table with the following structure:
PAYROLL
Field Name   Type        Width   Dec   Description
Name         Character   20            Name of employee
Post         Character   20            Post of the employee
Rate         Numeric     5       2     Hourly salary rate
Hour         Numeric     6       2     Number of hours worked
The salary of each employee is calculated by multiplying the hourly salary rate with the number of
hours worked, i.e. Salary = Rate*Hour.
Write SQL statements to fulfill the following requests:
a)
Print a list of employees showing all the information as well as the salary.
SELECT name, post , rate, hour, rate*hour AS salary FROM payroll
b)
Print those employees with salary greater than HK$10,000
SELECT name, rate*hour AS salary FROM payroll
WHERE rate*hour > 10000
c)
Find the average salary and the maximum salary of employees in the supermarket.
SELECT AVG(rate*hour) AS average_salary,
MAX(rate*hour) AS Max_salary FROM payroll
d)
For each kind of post, find the total working hours of employees. Display only those posts with
average working hours > 8.
SELECT post, SUM(hour) FROM payroll GROUP BY post HAVING AVG(hour) > 8
3.
The records in tables setP and setQ are shown below:
SETP        SETQ
X   Y       X   Y
3   5       2   4
1   7       8   3
2   4       1   6
3   6       5   9
2   3
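The two tables can be loaded into a database to experiment with UNION and join queries of the kind asked in this question. The sketch below uses Python's sqlite3; the assignment of rows to setP (five records) and setQ (four records) is reconstructed from the listing above and should be treated as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE setp (x INTEGER, y INTEGER)")
conn.execute("CREATE TABLE setq (x INTEGER, y INTEGER)")
conn.executemany("INSERT INTO setp VALUES (?, ?)",
                 [(3, 5), (1, 7), (2, 4), (3, 6), (2, 3)])
conn.executemany("INSERT INTO setq VALUES (?, ?)",
                 [(2, 4), (8, 3), (1, 6), (5, 9)])

# UNION removes duplicates, so each value appears once.
union_xx = sorted(row[0] for row in conn.execute(
    "SELECT x FROM setp UNION SELECT x FROM setq"))

# The columns need not have the same name, only compatible types.
union_xy = sorted(row[0] for row in conn.execute(
    "SELECT x FROM setp UNION SELECT y FROM setq"))

# Join: setP records whose x also occurs as an x in setQ.
joined = sorted(conn.execute(
    "SELECT p.x, p.y FROM setp AS p, setq AS q WHERE p.x = q.x"))

print(union_xx, union_xy, joined)
```

The lists are sorted only to make the output easy to compare; SQL itself does not guarantee an order without ORDER BY.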
State the result of the following SQL statements:
a)
SELECT x FROM setp UNION SELECT x FROM setq
X: 1, 2, 3, 5, 8
b)
SELECT x FROM setp UNION SELECT y FROM setq
X: 1, 2, 3, 4, 6, 9
c)
SELECT p.x, p.y FROM setp AS p, setq AS q WHERE p.x = q.x
X   Y
1   7
2   4
2   3
4.
The staff information of a company is stored in a table with the following structure:
STAFF
Field Name   Type        Width   Dec   Description
name         Character   20            name of a staff
department   Character   20            department of the staff
salary       numeric     9       2     salary of the staff
Identify and correct the errors in the following queries:
a)
SELECT name
WHERE salary = (SELECT MAX(salary) FROM staff)
The table name is missing in the main query, so, the correction is:
SELECT name FROM staff WHERE salary =
(SELECT MAX(salary) FROM staff)
b)
SELECT name FROM staff
WHERE salary = SELECT MAX(salary) FROM staff
The parentheses “(“ and “)” around the sub-query are missing, so, the correction is:
SELECT name FROM staff WHERE salary =
(SELECT MAX(salary) FROM staff)
c)
SELECT COUNT(*) FROM (SELECT MAX(salary) FROM staff
WHERE salary > 10000)
Subquery can be used in the WHERE clause only, so, the correction is:
SELECT COUNT(*) FROM staff WHERE salary > 10000
d)
SELECT name FROM staff
HAVING salary = (SELECT MAX(salary) FROM staff)
HAVING filters groups and is used together with GROUP BY; a condition on individual records
belongs in the WHERE clause, so, the correction is:
SELECT name FROM staff WHERE salary =
(SELECT MAX(salary) FROM staff)
e)
SELECT name FROM staff
WHERE MAX(salary) IN (SELECT salary FROM staff)
Aggregate function (like MAX, AVG, MIN, SUM, COUNT) cannot be used directly in the
WHERE clause, so, the correction is:
SELECT name FROM staff WHERE salary IN
(SELECT MAX(salary) FROM staff)
f)
SELECT name FROM staff
WHERE salary = (SELECT salary FROM staff WHERE salary > 10000)
More than one record is returned from the sub-query, so, the correction is:
SELECT name FROM staff WHERE salary IN
(SELECT salary FROM staff WHERE salary > 10000)
5.
The results of an English Contest in a class are stored in a table with the following structure:
RESULT
Field Name   Type        Width   Description
Name         Character   20      Name of a competitor
mark         numeric     3       Mark of the competitor
Write SQL statements to find the following:
a)
highest, average and lowest marks
SELECT MAX(mark), AVG(mark), MIN(mark) FROM result
b)
competitor with the highest mark
SELECT name FROM result WHERE MARK =
(SELECT MAX(mark) FROM result)
c)
competitors with mark above the average
SELECT name FROM result WHERE mark >
(SELECT AVG(mark) FROM result)
d)
number of competitors with mark above the average
SELECT COUNT(*) FROM result WHERE mark >
(SELECT AVG(mark) FROM result)
6.
An ISP keeps the information of the clients in a table with the following structure:
CLIENT
Field Name    Type         Width    Description
user_id       Character    10       A unique code that identifies a user
password      Character    10       Password for the user
name          Character    40       Name of the user
profession    Character    30       Profession of the user
Identify and correct the errors in each of the following SQL statements:
a)
Task: A view PASSWORD is needed to show the user_id and password only for all the clients.
SQL: CREATE VIEW password SELECT * FROM client
CREATE VIEW password AS SELECT user_id, password FROM client
b)
Task: A view STUDENT is needed to hold all the information except the password of the clients
who are students.
SQL: CREATE VIEW student SELECT user_id, name, profession
WHERE profession=’Student’
CREATE VIEW student AS SELECT user_id, name, profession FROM client
WHERE profession = ‘Student’
c)
Task: A view PROF_CNT is needed to show the number of clients in each profession.
SQL: CREATE VIEW COUNT(*) FROM client GROUP BY profession
CREATE VIEW prof_cnt AS
SELECT profession, COUNT(*) FROM client GROUP BY profession
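The corrected CREATE VIEW statements can be checked with the following sketch. It assumes Python's sqlite3 module and three made-up client records; the column alias num_clients is added only so that the count can be read back by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client (user_id TEXT PRIMARY KEY, password TEXT, "
    "name TEXT, profession TEXT)")
# Hypothetical sample clients.
conn.executemany(
    "INSERT INTO client VALUES (?, ?, ?, ?)",
    [("u001", "pw1", "Amy", "Student"),
     ("u002", "pw2", "Bob", "Teacher"),
     ("u003", "pw3", "Cal", "Student")],
)

# A view is defined with AS followed by a complete SELECT statement.
conn.execute(
    "CREATE VIEW student AS SELECT user_id, name, profession "
    "FROM client WHERE profession = 'Student'")
conn.execute(
    "CREATE VIEW prof_cnt AS SELECT profession, COUNT(*) AS num_clients "
    "FROM client GROUP BY profession")

# A view is queried like a table; the password column is not exposed.
student_rows = conn.execute("SELECT * FROM student ORDER BY user_id").fetchall()
prof_counts = dict(conn.execute("SELECT profession, num_clients FROM prof_cnt"))

print(student_rows)
print(prof_counts)
```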
<End of Revision Exercise 05>
Revision Exercise 06 (SQL)
1.
STAFF
Field Name    Type         Width    Dec    Description
Department    Character    20              Name of a department: e.g. sales, purchase, account
Name          Character    20              Name of the employee
Date_birth    Date                         Date of birth of the employee
Salary        Numeric      8        2      Salary of the employee
sex           Character    1               Sex of employee: 'M' for male, 'F' for female
Write SQL statements to fulfill the following requests:
a)
Produce a list to show the names of departments without duplicate lines.
SELECT DISTINCT department FROM staff
b)
Produce a list of all information in alphabetical order of name.
SELECT * FROM staff ORDER BY name
c)
Produce a list of all information in ascending order of age.
SELECT * FROM staff ORDER BY date_birth DESC
d)
Produce a sorted list of staff by name, classifying the staff in department with the male staff in
each department followed by the female staff.
SELECT department, sex, name, date_birth FROM staff
ORDER BY department, sex DESC, name
e)
Increase the salary by 5% for male staff of Sales Department.
UPDATE staff SET salary = salary*1.05 WHERE sex = ‘M’ AND department = ‘Sales’
f)
Remove all the records for staff of Account Department
DELETE FROM staff WHERE department = ‘Account’
2.
An insurance company stores the client information in a table with the following structure:
CLIENT
Field Name    Type         Width    Dec    Description
Name          Character    20              Name of client
Sex           Character    1               Sex of the client ('M' or 'F')
Date_birth    Date         8               Date of birth of the client
Occupation    Character    20              Occupation of the client
premium       Numeric      8        2      Premium of the client
Write SQL statements to fulfill the following requests:
a)
Produce a list of clients who were born in February, March, June or September.
SELECT * FROM client WHERE MONTH(date_birth) IN (2, 3, 6, 9)
b)
Produce a list showing all the occupations of the clients without duplication
SELECT DISTINCT occupation FROM client
c)
For each year between 1970 and 1990, find the number of clients who were born in the same
year.
SELECT YEAR(date_birth), COUNT(*) FROM client
WHERE YEAR(date_birth) BETWEEN 1970 AND 1990 GROUP BY YEAR(date_birth)
d)
Find the average premium of female clients
SELECT AVG(premium) FROM client WHERE sex = ‘F’
e)
Classify clients by year of birth. Find the average premium for those groups with average premium
more than HK$500.
SELECT YEAR(date_birth), AVG(premium) FROM client
GROUP BY YEAR(date_birth) HAVING AVG(premium) > 500
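The date-based queries in this question can be sketched as below. This is an illustration under stated assumptions: it uses Python's sqlite3 module, which has no YEAR() or MONTH() function, so strftime('%Y', ...) and strftime('%m', ...) stand in for them; the five clients are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client (name TEXT, sex TEXT, date_birth TEXT, "
    "occupation TEXT, premium NUMERIC)")
# Hypothetical sample clients; dates are stored as ISO text (yyyy-mm-dd).
conn.executemany(
    "INSERT INTO client VALUES (?, ?, ?, ?, ?)",
    [("Amy", "F", "1975-03-02", "Clerk", 400),
     ("Bob", "M", "1975-06-10", "Driver", 800),
     ("Cat", "F", "1980-02-20", "Nurse", 300),
     ("Dan", "M", "1980-09-01", "Clerk", 500),
     ("Eve", "F", "1995-01-15", "Student", 900)],
)

# Task (a): clients born in February, March, June or September.
born_in_months = [row[0] for row in conn.execute(
    "SELECT name FROM client "
    "WHERE CAST(strftime('%m', date_birth) AS INTEGER) IN (2, 3, 6, 9) "
    "ORDER BY name")]

# Task (e): group by year of birth, keep groups whose average exceeds 500.
high_premium_years = conn.execute(
    "SELECT CAST(strftime('%Y', date_birth) AS INTEGER) AS birth_year, "
    "AVG(premium) FROM client "
    "GROUP BY birth_year HAVING AVG(premium) > 500 "
    "ORDER BY birth_year").fetchall()

print(born_in_months)
print(high_premium_years)
```

WHERE removes individual records before grouping, while HAVING removes whole groups after the averages have been calculated.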
3.
A school stores the activity records of the students in two related tables which have the following
structures:
ENROLLMENT
Field Name    Type         Width    Description
Name          Character    20       Name of a student
Club_id       Character    4        Unique code of a club enrolled by the student
CLUB
Field Name    Type         Width    Description
Club_id       Character    4        Unique code of a club
name          Character    20       Name of the club
A student may enroll on more than one club. Write SQL statements for the following tasks:
a)
Create a list containing the student name and the corresponding club name(s) for each student.
SELECT e.name, c.name FROM enrollment AS e, club AS c
WHERE e.club_id = c.club_id ORDER BY e.name
b)
Create a list containing the club name and the names of club members for each club.
SELECT c.name, e.name FROM enrollment AS e, club AS c
WHERE e.club_id = c.club_id ORDER BY c.name
c)
Create a list of club members for Computer club.
SELECT e.name FROM enrollment AS e, club AS c
WHERE e.club_id = c.club_id AND c.name = 'Computer club'
4.
The club and activity information of students in a school is stored in the following tables:
ACTIVITY
Name              Club_id
Janet Chan        02
Isabella Wong     03
Quentin Cheung    02
Robin Kong        02
Robin Kong        03
Sidney Ah         03
CLUB
Club_id    Club_Name
02         Swimming
03         Violin
a)
State the result of the following SQL statement:
SELECT name FROM activity
WHERE club_id =
(SELECT club_id FROM club WHERE club_name = ‘Violin’)
Name
Isabella Wong
Robin Kong
Sidney Ah
b)
Rewrite the above SQL statement using INNER JOIN.
SELECT a.name FROM activity AS a, club AS c
WHERE a.club_id = c.club_id AND c.club_name = ‘Violin’
OR
SELECT a.name FROM activity AS a INNER JOIN club AS c ON
a.club_id = c.club_id WHERE c.club_name = 'Violin'
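To see that the subquery and the INNER JOIN give the same result, the sketch below loads the ACTIVITY and CLUB records of this question into an in-memory SQLite database through Python's sqlite3 module (an assumption made for the demonstration; the class software is different).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (name TEXT, club_id TEXT)")
conn.execute("CREATE TABLE club (club_id TEXT PRIMARY KEY, club_name TEXT)")
conn.executemany("INSERT INTO club VALUES (?, ?)",
                 [("02", "Swimming"), ("03", "Violin")])
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [("Janet Chan", "02"), ("Isabella Wong", "03"), ("Quentin Cheung", "02"),
     ("Robin Kong", "02"), ("Robin Kong", "03"), ("Sidney Ah", "03")],
)

# Version 1: subquery in the WHERE clause.
by_subquery = [row[0] for row in conn.execute(
    "SELECT name FROM activity WHERE club_id = "
    "(SELECT club_id FROM club WHERE club_name = 'Violin') ORDER BY name")]

# Version 2: INNER JOIN with table aliases.
by_join = [row[0] for row in conn.execute(
    "SELECT a.name FROM activity AS a INNER JOIN club AS c "
    "ON a.club_id = c.club_id WHERE c.club_name = 'Violin' ORDER BY a.name")]

print(by_subquery)
print(by_join)
```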
5.
The results of a public examination are stored in the following table:
Exam
Field Name    Type         Width    Description
Subject       Character    20       Name of a subject
Num_credit    Numeric      4        Number of students with a credit (A-C) in the subject
Num_pass      Numeric      4        Number of students passing the subject (D-E)
Num_fail      Numeric      4        Number of students failing the subject (F)
Write SQL statements for the following task:
a)
Find the passing percentage of each subject. Display the results accurate to 1 decimal place.
SELECT subject, ROUND(
(num_credit + num_pass) / (num_credit + num_pass + num_fail)*100,1)
FROM exam
b)
Find the subject with the highest passing percentage.
SELECT subject FROM exam WHERE ROUND(
(num_credit + num_pass) / (num_credit + num_pass + num_fail)*100,1) =
(SELECT MAX(ROUND((num_credit + num_pass) / (num_credit + num_pass +
num_fail)*100,1)) FROM exam)
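The passing percentage can be tried with the sketch below, which assumes Python's sqlite3 module and three made-up subjects. One difference from the answer above is an assumption about the engine, not about the question: SQLite divides two integers as integers, so the sketch multiplies by 100.0 first to force a decimal result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exam (subject TEXT, num_credit INTEGER, "
    "num_pass INTEGER, num_fail INTEGER)")
# Hypothetical sample figures.
conn.executemany(
    "INSERT INTO exam VALUES (?, ?, ?, ?)",
    [("Chinese", 30, 50, 20), ("English", 25, 40, 38), ("Maths", 40, 45, 15)],
)

# Passing percentage = (credit + pass) / (all candidates) * 100, to 1 d.p.
PASS_PCT = ("ROUND(100.0 * (num_credit + num_pass) / "
            "(num_credit + num_pass + num_fail), 1)")

percentages = conn.execute(
    f"SELECT subject, {PASS_PCT} FROM exam ORDER BY subject").fetchall()

# The subject whose percentage equals the maximum percentage.
best_subjects = [row[0] for row in conn.execute(
    f"SELECT subject FROM exam WHERE {PASS_PCT} = "
    f"(SELECT MAX({PASS_PCT}) FROM exam)")]

print(percentages)
print(best_subjects)
```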
<End of Revision Exercise 06>
Revision Exercise 06 (SQL)
1.
The Education Bureau keeps the information about the schools in the tables as follows:
SCHOOL
Field Name    Type         Width    Description
Sch_id        Character    4        A unique code that identifies a school
School        Character    40       Name of the school
Principal     Character    40       Name of the principal of the school
telephone     Character    10       Telephone number of the school
SUBJECT
Field Name    Type         Width    Description
Subj_id       Character    3        A unique code that identifies a subject
subject       Character    40       The name of the subject
OFFER
Field Name    Type         Width    Description
Sch_id        Character    4        Code of the school that offers a subject
Subj_id       Character    3        Code of the subject
NumOfStud     Numeric      3        Number of students taking the subject
a)
Explain what a foreign key field is. State an example using the tables above.
A foreign key is a field (or group of fields) in one table whose values match the key field
of another table. Therefore, each foreign key value refers to exactly one record in that other
table. Examples of foreign keys are sch_id and subj_id in the table OFFER.
b)
State the primary key for each table.
SCHOOL, primary key: sch_id
SUBJECT, primary key: subj_id
OFFER, primary key: sch_id, subj_id
c)
Write the SQL statement to create the table SUBJECT and OFFER.
CREATE TABLE subject (
subj_id char(3) unique, subject char(40), PRIMARY KEY (subj_id))
CREATE TABLE offer (
sch_id char(4), subj_id char(3), numofstud numeric(3),
PRIMARY KEY (sch_id, subj_id))
d)
Write SQL statements for the following tasks:
(i)
Produce a list showing those schools which offer the subject 'Computer Studies'.
SELECT s.school FROM school AS s, subject AS j, offer AS o
WHERE s.sch_id = o.sch_id AND j.subj_id = o.subj_id AND
j.subject = ‘Computer Studies’
(ii)
Find how many subjects are available in ‘ABC school’.
SELECT COUNT(*) FROM school AS s, offer AS o
WHERE s.sch_id = o.sch_id AND s.school = ‘ABC school’
(iii)
Produce a list showing the subjects available in ‘ABC school’.
SELECT j.subject FROM school AS s, subject AS j, offer AS o
WHERE s.sch_id = o.sch_id AND j.subj_id = o.subj_id AND s.school = ‘ABC school’
(iv)
Find the total number of students taking the subject 'Computer Studies' in HONG KONG.
SELECT SUM(numofstud) FROM subject AS j, offer AS o
WHERE j.subj_id = o.subj_id AND j.subject = 'Computer Studies'
2.
The results of an inter-class English Contest are stored in a table with the following structure:
RESULT
Field Name
Type
Width
Description
Name
Character
20
Name of a competitor
Class
Character
2
Class of the competitor
Mark
Character
3
Mark of the competitor
Write SQL statements to find the following:
a)
highest, average and lowest marks in each class.
SELECT class, MAX(mark), AVG(mark), MIN(mark) FROM result GROUP BY class
b)
students with the highest mark in each class.
SELECT class, name FROM result AS r WHERE mark =
(SELECT MAX(mark) FROM result WHERE class = r.class)
c)
students with mark above the class average mark in each class.
SELECT class, name FROM result AS r WHERE mark >
(SELECT AVG(mark) FROM result WHERE class = r.class)
d)
students in 3A with mark above the overall average mark.
SELECT name FROM result WHERE class = '3A' AND mark >
(SELECT AVG(mark) FROM result)
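The "in each class" tasks of this question need either GROUP BY or a subquery that looks only at the class of the current record. The sketch below is one possible way to write them, shown with Python's sqlite3 module and five made-up competitors; the alias r and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (name TEXT, class TEXT, mark INTEGER)")
# Hypothetical sample competitors.
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [("Amy", "3A", 80), ("Ben", "3A", 60),
     ("Cat", "3B", 90), ("Dan", "3B", 75), ("Eve", "3B", 50)],
)

# (a) highest and lowest mark per class: one output row per group.
extremes = conn.execute(
    "SELECT class, MAX(mark), MIN(mark) FROM result "
    "GROUP BY class ORDER BY class").fetchall()

# (b) top student per class: the subquery is re-evaluated for each
# record r and only considers marks from r's own class.
top_per_class = conn.execute(
    "SELECT class, name FROM result AS r WHERE mark = "
    "(SELECT MAX(mark) FROM result WHERE class = r.class) "
    "ORDER BY class").fetchall()

# (c) students above the average of their own class.
above_class_avg = conn.execute(
    "SELECT class, name FROM result AS r WHERE mark > "
    "(SELECT AVG(mark) FROM result WHERE class = r.class) "
    "ORDER BY class, name").fetchall()

# (d) 3A students above the overall average (71 for the sample data).
above_overall_3a = [row[0] for row in conn.execute(
    "SELECT name FROM result WHERE class = '3A' AND mark > "
    "(SELECT AVG(mark) FROM result)")]

print(extremes, top_per_class, above_class_avg, above_overall_3a)
```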
3.
A fashion company keeps the stock information in the tables STOCK and DESIGNER. The
structures of the tables are shown below:
STOCK
Field Name     Type         Width    Description
product_id     Character    4        Unique code that identifies a product
designer_id    Character    4        Code of the designer for the product
type           Character    20       Type of the product
size           Character    1        Size of the product, may be 'L', 'M' or 'S'
qty            Numeric      4        Quantity of the product
DESIGNER
Field Name     Type         Width    Description
designer_id    Character    4        Unique code that identifies a designer
name           Character    20       Name of the designer
telephone      Character    10       Telephone number of the designer
An example of product_id is ‘0034’. Different sizes of the same product will use different
product_id.
a)
State the primary keys for the above tables.
STOCK: Primary key: product_id
DESIGNER: Primary key: designer_id
b)
Explain why the product_id is stored as characters.
Reasons for storing product_id as characters
1.
Leading zeros or spaces can be kept
2.
Calculation on product_id is rare
3.
Display of the numbers in product_id is not affected by the display format
4.
More efficient
c)
Explain why the information about the designers is stored separately.
A designer may have more than one product. Storing the designers separately can avoid
data redundancy. Otherwise, updating the stock records may lead to anomalies, i.e. errors
or inconsistencies.
d)
Write SQL statements for the following tasks:
(i)
Produce a list showing the total quantity of each design.
SELECT product_id, SUM(qty) FROM stock
GROUP BY product_id
(ii)
Find the total quantity for the type ‘Pullover’.
SELECT SUM(qty) FROM stock
WHERE type = ‘Pullover’
(iii)
Produce a list showing the product_id of the designer ‘Timothy’, without duplicating rows.
SELECT DISTINCT product_id FROM stock AS s, designer AS d
WHERE s.designer_id = d.designer_id AND d.name = ‘Timothy’
(iv)
Find which design of size ‘M’ has the largest quantity.
[Hint: A design is identified by product_id]
SELECT product_id FROM stock WHERE size = ‘M’ AND qty =
(SELECT MAX(qty) FROM stock WHERE size = ‘M’)
<End of Revision Exercise 06>
Revision Exercise 07 (SQL)
1.
In the Hong Kong District Council Election, the information about the districts and candidates are
stored in the following tables:
DISTRICT
Field Name    Type         Width    Description
Dist_id       Character    4        A unique code that identifies a district
District      Character    20       Name of the district
VoterNum      Numeric      7        Number of voters in the district
CAND
Field Name    Type         Width    Description
Candidate     Character    4        Name of the candidate
Dist_id       Character    4        The district code of the candidate
NumOfVote     Numeric      7        Number of votes obtained by the candidate
a)
For each of the following tasks, determine whether the SQL statement can fulfill the task. If not,
rewrite the SQL statement.
(i)
Task: Produce a list showing all the districts
SQL: SELECT DISTINCT dist_id FROM cand
Incorrect. District names are not displayed. The corrected statement is
SELECT district FROM district
(ii)
Task: Produce a list of candidates in Hong Kong East.
SQL : SELECT candidate FROM cand AS c, district AS d
WHERE district = ‘Hong Kong East’
Incorrect. The join condition is missing. The corrected statement is
SELECT candidate FROM cand AS c, district AS d
WHERE c.dist_id = d.dist_id AND district = 'Hong Kong East'
b)
State the meaning of the following SQL statements:
(i)
SELECT AVG(VoterNum) FROM district
Find the average number of voters among all the districts in Hong Kong
(ii)
SELECT AVG(NumOfVote) FROM cand GROUP BY dist_id
Find the average number of votes among all candidates in each district
(iii)
SELECT dist_id FROM cand GROUP BY dist_id HAVING COUNT(*) > 2
Find the codes of the districts which have more than 2 candidates
c)
Assume that each voter can vote for one candidate only. Write SQL statements to find the
following figures:
(i)
the number of districts in Hong Kong
SELECT COUNT(*) FROM district
(ii)
the total number of voters in Hong Kong
SELECT SUM(VoterNum) FROM district
(iii)
the number of candidates in each district
SELECT COUNT(*) FROM cand GROUP BY dist_id
(iv)
the total number of voters who have voted in each district
SELECT SUM(NumOfVote) FROM cand GROUP BY dist_id
(v)
the percentage of voters who have voted in Hong Kong.
SELECT SUM(NumOfVote) / (SELECT SUM(VoterNum) FROM district) * 100 FROM cand
<End of Revision Exercise 07>
Past Paper on Database
2001 – AS – CA #2
2.
John wants to design a database DB to store information about his friends. Therefore, he designs a
database file FRIENDS with the following structure:
(a) (i)
Although HKID is unique to each person, John cannot use it as the primary key. Explain why
not.
The "if any" in the description indicates that HKID is not a mandatory data item,
so some records may have no HKID, and a primary key cannot be empty.
(ii)
John wants to define a primary key involving FIRST_NAME and LAST_NAME. Describe the
procedure that John should follow.
Create a new field holding First_Name + Last_Name and define that field
as the key. Defining a composite key "Last_Name + First_Name" is also
acceptable.
(iii)
Give an example where the primary key in part (a)(ii) may not be valid.
John’s friends may have the same given name as well as surname
(b)
John uses another database file SCHOOL in database DB to store the school codes and school
names. The contents of SCHOOL are shown below.
SCHOOL_ID    SCHOOL_NAME
081          Eden College
252          Hong Kong Number One Primary School
375          The Hong Kong Government School
441          Olympian Secondary School
782          Hong Kong Iciban Secondary School
956          Intensive Middle School
The contents of FRIENDS are shown below.
FIRST_NAME LAST_NAME HOME_TEL SCHOOL_ID …
Amy
Chan
25258123
252
…
Johnny
Chan
25532152
780
…
Chris
Cheung
25787523
441
…
Mary
Lam
24545510
Paul
Ng
25458648
…
252
…
Joe
Yeung
28585656
441
…
Do the above database files violate the integrity of the database DB? Explain.
(3 marks)
Yes.
The SCHOOL_ID 780 used in FRIENDS (Johnny Chan) has no matching record in SCHOOL,
so referential integrity is violated.
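A check like the one in this answer can also be done by a query. The sketch below is only an illustration using Python's sqlite3 module: it loads the SCHOOL records and four of the FRIENDS records shown above (only the fields needed), then lists every friend whose SCHOOL_ID has no matching school.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE school (school_id TEXT PRIMARY KEY, school_name TEXT)")
conn.execute(
    "CREATE TABLE friends (first_name TEXT, last_name TEXT, "
    "home_tel TEXT, school_id TEXT)")
conn.executemany(
    "INSERT INTO school VALUES (?, ?)",
    [("081", "Eden College"),
     ("252", "Hong Kong Number One Primary School"),
     ("375", "The Hong Kong Government School"),
     ("441", "Olympian Secondary School"),
     ("782", "Hong Kong Iciban Secondary School"),
     ("956", "Intensive Middle School")],
)
conn.executemany(
    "INSERT INTO friends VALUES (?, ?, ?, ?)",
    [("Amy", "Chan", "25258123", "252"),
     ("Johnny", "Chan", "25532152", "780"),
     ("Chris", "Cheung", "25787523", "441"),
     ("Joe", "Yeung", "28585656", "441")],
)

# LEFT JOIN keeps every friend; where no school matches, the school
# columns are NULL, which marks a broken reference.
orphans = conn.execute(
    "SELECT f.first_name, f.last_name, f.school_id "
    "FROM friends AS f LEFT JOIN school AS s ON f.school_id = s.school_id "
    "WHERE s.school_id IS NULL").fetchall()

print(orphans)
```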
2002 – AS – CA #5
5. Ms. Wong is conducting a survey on the service of the school tuck shop by doing the following steps:
- collecting completed written questionnaires from students;
- inputting the data into a computer; and
- presenting the result of the survey using a presentation graphics software package.
She finds that there are a lot of mistakes on the completed questionnaires. One of the questionnaires with
mistakes is shown below.
Ms. Wong now decides to have a new arrangement so that the students can fill in the questionnaires
online.
(a)
Explain how the online input can help Ms. Wong to improve the following:
(i) the completeness of data collection
Validate the presence of input data for mandatory fields (e.g. the sex field on the
questionnaire) /Check the number of selection, e.g. at most 3 items should be
selected
(ii) the correctness of data collection
Validate the range of data for the correctness of data, e.g. check the number of
purchase
Validate the format of data for the correctness of data, e.g. check the date format
(b)
Give a reason to justify Ms. Wong’s new arrangement in addition to the improvements given in Part
(a).
Reduce the time needed for data input (other reasonable answers)
(c)
Ms. Wong decides to employ a programmer to develop the system rather than to buy an existing
software package available on the market. Give TWO reasons to support Ms. Wong’s decision.
Satisfy unique requirements
Future modification or enhancement is easier
(d)
Ms. Wong only wants to use touch screens for students to input data. Describe how the students can
fill in the numerical items in the questionnaire.
Use the numeric pad on the touch screen
2003 – AS – CA #2
2
(b)
Users sometimes make mistakes when keying in data into a database. Suggest two possible measures
that can be considered when designing the database in order to minimize these mistakes.
Set a field as primary key and set some fields to be unique
Specify validation rules for some fields
Set input mask for some fields
Set mandatory fields to reject null entry
(any two)
To make a field a primary key or unique, we can use the command
"CREATE TABLE" or "ALTER TABLE". Remember that a primary key has to be
handled at database level instead of table level, i.e. a primary key can be defined only
if the table is in a database; if the table is not in a database but is just a single table,
we cannot define the field as primary but as unique only. A primary key implies the
property of uniqueness. E.g.
ALTER TABLE info ALTER stu_id char(10) PRIMARY KEY
We can set some validation rules to some fields by the command CREATE TABLE or ALTER
TABLE. E.g.
CREATE TABLE result (stu_id char(10), test numeric(4,0),
exam numeric(4,0) CHECK exam >= 0 ERROR "Positive integer only!")
(b) (i) Compared with the character data type, state one advantage of defining a field as the memo data
type.
A memo data type provides storage space for text information of variable
length to avoid unnecessary waste of storage space.
(unlimited / insert graphics / separate file)
(ii) Describe a situation in which it is more appropriate to define a field as the character data type rather
than the memo data type.
Justification: When the text information in the field is very short (e.g. less
than 4 characters) OR the length of the text information is limited OR many
complicated string manipulations (e.g. sorting, calculation) are required, it
is more appropriate to define as character data type.
(2 marks)
2003 – AS – CA #4
4.
The following table shows the structure of a database file STUDENT containing the records of all students in a
school.
Field name    Type         Width    Description
CLS_NAME      Character    2        Class name (e.g. 1A, 2E, 4D)
CLASS_NO      X            2        Class number
EN_NAME       Character    25       Student name in English
DOA           Date         8        Date of Admission in format mm/dd/yy
LOGIN_ID      Character    Y        Login name of School Intranet System
For each of the following cases, write suitable statement(s) (SQL / database commands) to generate a login
name for each student and store the login name into the field LOGIN_ID.
a)
X represents Character.
Y represents 4.
The first two characters of LOGIN_ID are the class name of the student.
The last two characters of LOGIN_ID are the class number of the student.
Example: For a 5C student with class number 08, his/her LOGIN_ID should be ‘5C08’.
(2 marks)
UPDATE student SET login_id = cls_name + class_no
b)
X represents Numeric value without decimal places.
Y represents 4.
The first two characters of LOGIN_ID are the class name of the student.
The last two characters of LOGIN_ID are the class number of the student.
Example: For a 5C student with class number 8, his/her LOGIN_ID should be ‘5C08’.
UPDATE student SET login_id = cls_name + STR(class_no, 2)
WHERE class_no >= 10;
UPDATE student SET login_id = cls_name + '0' + STR(class_no, 1)
WHERE class_no <= 9
c)
X represents Character.
Y represents 6.
The first two characters of LOGIN_ID are the year of admission.
The next two characters of LOGIN_ID are the class name of the student.
The last two characters of LOGIN_ID are the class number of the student.
Example: For a 5C student with class number 08 who was admitted on 09/01/97, his/her LOGIN_ID should be
‘975C08’.
UPDATE student SET login_id = RIGHT(STR(YEAR(doa)),2) + cls_name + class_no
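The string building in cases (b) and (c) can be checked with a short sketch. It is written in Python rather than database commands, and the two function names are made up for this illustration; the point is only to show the zero padding and the two-digit year.

```python
from datetime import date


def login_id_from_number(cls_name: str, class_no: int) -> str:
    """Case (b): class number is numeric, so pad it to two digits."""
    return f"{cls_name}{class_no:02d}"


def login_id_with_year(doa: date, cls_name: str, class_no: str) -> str:
    """Case (c): two-digit admission year + class name + class number."""
    return f"{doa:%y}{cls_name}{class_no}"


# Examples taken from the question text.
id_b = login_id_from_number("5C", 8)
id_b_two_digits = login_id_from_number("5C", 12)
id_c = login_id_with_year(date(1997, 9, 1), "5C", "08")

print(id_b, id_b_two_digits, id_c)
```

The two UPDATE statements in case (b) do the same padding by treating class numbers below 10 separately.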
2005 – AS – CA #6
6.
Mr. Chin, the Extra-curricular Activities Master of a school, uses the database files, CLUB and MEM, to
store the information of clubs and members of those clubs respectively. At the end of a school year,
students' testimonials will show their extra-curricular activities records which are retrieved from these
files.
The records in CLUB and MEM are listed as follows:
CLUB
ClubNo    ClubName            TeacherInCharge
001       Science Club        Mr. Chan
002       Mathematics Club    Ms. Ng
003       English Club        Mr. Fok
004       Fencing Club        Ms. Chau
005       Tennis Club         Mr. Fong
006       Drama Club          Ms. Yau
007       Volleyball Club     Mr. Tsoi
008       Basketball Club     Ms. Lee
009       Football Club       Mr. Sung
010       Chess Club          Ms. Lau
MEM
StudentID    StudentName      Class    ClassNo    Telephone    ClubNo
00001        Chan Chun Yin    1A       3          91632589     003
100003       Chan Ka Ho       1A       5          26739876     004
100003       Chan Ka Ho       1A       5          26739876     008
:            :
1700135      Wong Wai         7A       30         61388792     010
There are 10 clubs in the school and thus 10 records in CLUB. The key field of CLUB is ClubNo. The
two files are related by ClubNo.
One day, a student helps Mr. Chin to input the following new record into MEM:
StudentID    StudentName    Class    ClassNo    Telephone    ClubNo
100003       Chan Ka Ho     1A       5          26739876     012
a) Should the database system accept the above record? Explain briefly.
(1 mark)
No. There is no such club with ClubNo 012.
<Here, we should know that it violates the integrity of the database. Moreover, you should be
able to set up the database to prevent this kind of problem. How? Use a foreign key and set it
up properly so that the values entered in the field must match a primary key value in the
parent table. Of course, a foreign key has more features; revise them if needed.>
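How a foreign key stops such a record can be sketched as follows. This is an illustration only: it uses Python's sqlite3 module, where foreign key checking must be switched on with PRAGMA foreign_keys = ON, it keeps only three fields of MEM, and the helper add_member is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.execute("CREATE TABLE club (club_no TEXT PRIMARY KEY, club_name TEXT)")
conn.execute(
    "CREATE TABLE mem ("
    " student_id TEXT, student_name TEXT,"
    " club_no TEXT REFERENCES club(club_no),"   # foreign key to the parent table
    " PRIMARY KEY (student_id, club_no))")      # composite primary key
conn.executemany("INSERT INTO club VALUES (?, ?)",
                 [("004", "Fencing Club"), ("008", "Basketball Club")])


def add_member(student_id, student_name, club_no):
    """Return True if the record is accepted, False if the database rejects it."""
    try:
        conn.execute("INSERT INTO mem VALUES (?, ?, ?)",
                     (student_id, student_name, club_no))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = add_member("100003", "Chan Ka Ho", "004")      # club exists
bad_club = add_member("100003", "Chan Ka Ho", "012")      # no club 012
repeated = add_member("100003", "Chan Ka Ho", "004")      # same key twice

print(accepted, bad_club, repeated)
```

The second insert fails because of the foreign key, and the third fails because StudentID + ClubNo is already used as a key value.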
b)
During the school year, Ms. Chau, the teacher in charge of the Fencing Club, leaves the
school. No other teacher in the school is suitable to lead the Club so it has to be closed.
Suggest a modification to the structure of the database file(s) so that the existing clubs in the
school can be shown.
Add a logical field in CLUB to indicate whether the club is active or not.
<e.g.
ClubNo    ClubName    TeacherInCharge    Active
012       Football    Chan Tai Man       .T.>
c)
In the school, each student can join several clubs. Each time a student joins a club, there will
be a record in MEM.
(i)
State two drawbacks of this arrangement of MEM.
1.
data of a student may appear more than once in the file MEM /
longer data entry time (Data redundancy)
2.
when the data of a student changes, all records related to the
student should have to be updated.
3.
any incomplete updating causes data inconsistency
4.
wastes storage space / longer access time
(ii)
Suggest a primary key to MEM.
StudentID & ClubNo (Class & ClassNo & ClubNo)
<- Composite key.
2006 – AS – CA #6
6.
A school uses a database file, STU, to store information about students, as follows:
STU
Field name    Type         Width    Description                    Example of data
SNAME         character    20       Name of the student            Wong Lai Mei
SCLASS        character    2        Class of the student           3D (Form 3, class D)
SNO           character    2        Class number of the student    38
STUID         character    6        Student code                   501478
a)
Give two conditions for a field that can be used as a primary key.
Unique and mandatory (non-empty, not null)
<- I think all of you should be able to answer this question, because it requires only some
fundamental knowledge. (Even if you don't know the word "mandatory", you should
know that the field must be non-null.) In fact, the question itself should not use the word
"conditions": unique and mandatory are not conditions, they are properties. The condition
for a primary key is "It is the field that uniquely identifies every single entry of the database
table." Therefore, do not take the wording of the questions too seriously; try your
best to use the knowledge in the book to answer them.
<- Like the primary key, the condition (not property) for an INDEX is that it speeds up data
searching on some frequently used fields by enough to compensate for the extra workload when
data is updated. (Always remember that if there are too many indexes on different fields in a
table, a single data entry modification results in a lot of indexing workload.
So, apart from the primary key, which is always indexed by default,
there are usually at most one or two more indexed fields.) Here, you should know that the
values of an indexed field need not be unique, i.e. repeated values can
be indexed. If you do not understand this paragraph, come and ask me.
<- If this time the question asks you "What is the condition for creating a VIEW in a
database?", what would you answer?
First of all, I have to admit that I do not know the focus of this question (which
often happens with the questions in ASCA) and I have nothing in mind, but anyway,
I would use the knowledge in the book to answer it. The answer I would give
is:
A view is a virtual table formed from one or more tables existing in the database, so
every condition that applies to creating a table also applies to a view, e.g. a
primary key field should be present. Since the data in the view comes from other
existing tables, those tables must be present. Also, a view lets
different users access part of the data in different tables (access here means read,
write and modify), so changes made in the view should be reflected in the parent tables.
<- See, just try to use the knowledge in the book to answer the question, and then
you can get the mark.
b)
Suggest a primary key for STU.
STUID / SCLASS + SNO / SNAME + SCLASS + SNO + STUID
<- Since a primary key (or candidate key) should be the minimal combination
of fields that uniquely identifies each entry in the database table, basically only
STUID should be considered the 'appropriate' choice for the primary key.
Another database file, EXAM, is used to store the students’ examination results.
EXAM
Field name    Type         Width    Description                                      Example of data
STUID         Character    6        Student code                                     501478
SUBID         Character    2        Subject code                                     CH (Chemistry)
MARK          Integer      2        Exam mark of student STUID in subject SUBID     78
c)
Suggest a primary key for EXAM.
STUID + SUBID / STUID + SUBID + MARK
<- STUID + SUBID + MARK is a super key, but it is not an appropriate
choice for the primary key.
d)
Assume that EXAM is related to STU.
(i)
Can STUID in EXAM be used as a foreign key? Explain briefly.
Yes, it can uniquely identify the records in STU. (other descriptive
statements)
(ii)
Write down a SQL command to create EXAM with the primary key in part (c).
CREATE TABLE EXAM(
STUID CHAR(6),
SUBID CHAR(2),
MARK INTEGER(2),
PRIMARY KEY (STUID, SUBID))
Note: (UNIQUE  PRIMARY KEY)
Do you still remember the data types available in the database? Here are some examples:
char(n)
numeric(n, m)
date
logical
memo
integer
Appendix 1: databases and web server and web applications:
Web application and database relations
When we create a server side application, we usually have to deal with some databases. E.g.
an online forum or an online multiple choice system are common applications. They usually
work through the following process:
1.
Client computer -> (HTTP request) -> Web Server
Description: The client computer sends an HTTP request to the web server by using the
web browser. (Usually, the client computer will send the URL (e.g. http://abc.com), then
the DNS server will resolve the domain name into its IP address.)
2.
Inside the web server: Server application <-> Database Engine <-> The databases
The client computer will wait for the response. The web server will execute the server
application (some program codes). These program codes will try to connect to the
databases residing on the server. Usually, it is done by using some database engine, e.g.
Microsoft Jet 4.0. It is the engine which performs the SQL request. You should note the
direction of the data flow.
3.
The server application will then generate the HTML codes according to the information
obtained from the databases. Then, it will send the web pages (HTML codes) to the client
computer. After the web pages and their corresponding multimedia files are downloaded,
they will be interpreted and displayed by the web browser. Now, the web server will have
nothing to do but to listen to the network to see if it can serve any other client computers.
Since the process involves several steps, we will investigate them one by one.
Step 1:
A web server (it is in fact a software program) has to be set up to listen to the network and handle the
HTTP requests accordingly. In the market, there are two common web servers which are widely used
and free of charge. They are IIS and Apache (open source). These web servers provide a framework to
deal with the HTTP requests from client computers, i.e. if a client asks for a specific file (a web page or a
multimedia file), the web server will arrange the delivery of that data.
Step 2:
Apart from static web pages, a web server is supposed to provide dynamic web pages, i.e. it should be
able to run server side applications. IIS has its server side programming built in; it is
called ASP (.asp) or ASP.NET (.aspx). For Apache, however, one has to install PHP to make the web server
able to execute server side programs; the common files are .php. Finally, it is not enough to
have just the server side programs: a database engine has to be installed to provide the interface that
executes the queries from the application on the database. IIS has its engine installed. For
Apache, however, MySQL has to be installed to get the connection to the database.
*The details of setting will be discussed in the section of setting up a web server.
Step 3:
After the delivery of the web page, there is basically no connection between the web server and the
client computer any more. It is the web browser's responsibility to interpret the HTML code and
display it on the screen. That is why the same web page from the same web server may look
different in two different web browsers.
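The three steps can be sketched in a few lines of code. The sketch below is an assumption-laden illustration, not the IIS/ASP setup described in these notes: it uses Python's built-in http.server as the web server and sqlite3 as the database engine, and the names build_page, ClubHandler and serve are made up for the example.

```python
import sqlite3
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the database that resides on the server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE club (club_no TEXT PRIMARY KEY, club_name TEXT)")
conn.executemany("INSERT INTO club VALUES (?, ?)",
                 [("001", "Science Club"), ("002", "Mathematics Club")])


def build_page(db):
    """Step 2: the server application queries the database and generates HTML."""
    rows = db.execute("SELECT club_no, club_name FROM club ORDER BY club_no")
    items = "".join(f"<li>{escape(no)}: {escape(name)}</li>" for no, name in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


class ClubHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Step 1 has happened: an HTTP request has arrived from a browser.
        body = build_page(conn).encode("utf-8")
        # Step 3: send the generated HTML back; the browser displays it.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Listen on the local machine until interrupted (not started automatically)."""
    HTTPServer(("127.0.0.1", port), ClubHandler).serve_forever()


page = build_page(conn)
print(page)
```

Calling serve() and opening http://127.0.0.1:8000/ in a browser would show the generated list; the port number 8000 is an arbitrary choice.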
Setting up a web server
Since IIS is easier to install (both the server side programming and the database engine are included),
we will use IIS for the demonstration.
In Windows XP or Windows 2000, IIS is regarded as a Windows component. So, to install IIS, all you have
to do is:
1. Open "Add / Remove Program" in Control Panel.
2. Select "Windows components" and check the box of "Internet Information Services (IIS)" in
the dialog box.
3. Insert the installation CD (Windows XP prof. edition SP2) and then IIS is installed.
4. To set up the web server, we can open the management console (msc) through control panel ->
administrative tools -> Internet Information Services. From it, we can modify the properties of the web server.
5. First of all we should add “index.htm”, “index.asp” or “index.html” as the default home page for the
web server.
6. Now, we can create a web page called “index.htm” and put it in the root of the web server, which is
in fact in the path “C:\Inetpub\wwwroot\”.
7. Then, if you’ve created the web page with the name “index.htm” and put it in the correct folder and
set the default home page as “index.htm”, then, you should find your server working. You can test it
by launching the HTTP request with the following URL in your browser http://127.0.0.1/
A web server should be open to the public. So, if you are using a real IP address, say
212.44.55.66, you can access the web server from a remote computer through the Internet
with the following URL: http://212.44.55.66/. However, if you have a firewall, you should configure
it properly; otherwise, remote computers cannot connect to your web server. As you know,
letting the outside world connect to your server exposes it to hacking, so we usually
open only a few ports to the public. For the firewall in Windows XP, the firewall
should be set through
Control Panel -> Network -> Local Area Network -> (right-click and select Properties) -> Advanced ->
edit the firewall settings.
Usually, we should turn on the firewall and allow some exceptions. For example, we can add an
exception so that the HTTP port (port number 80) is open. By doing so, the web server is open to the
public and uses port number 80 for communication. By now, remote client
computers should be able to connect to the web server with the URL: http://212.44.55.66.
If your IP address is fixed, you can register a domain name and map the domain name to
your IP address to host a web server, e.g. http://abc.com -> http://212.44.55.66.
Sometimes, we can set up an FTP server and direct the FTP root to the root of the web server for
updating…
Developing server side application by using server side programming
We can develop PHP programs on Apache and ASP programs on IIS. For simplicity, we will focus only
on developing ASP programs.
To tell the web server that a file is an ASP program, the first line of the asp file would be:
<%@LANGUAGE="VBSCRIPT" CODEPAGE="950"%>
Also, all the program code should be placed between <% and %>.
To get the connection to the database, the following statements are required:
Set Conn = Server.CreateObject("ADODB.Connection")   ' Create a variable to hold the connection
Conn.Provider = "Microsoft.Jet.OLEDB.4.0"   ' Define the database engine used
Conn.Open(Server.MapPath("db/project.mdb"))   ' Set the path of the database file
Set RS = Server.CreateObject("ADODB.Recordset")   ' Create a variable to hold the result of the query (SQL)
To interact with the database, we use SQL. So, we have to define the SQL statement, then
execute it and store the result in the recordset RS.
sql = "SELECT * FROM student"   ' Define the SQL statement
RS.Open sql, Conn   ' Execute the SQL
To guard against the case where the SQL statement returns no result, it is common to write it
this way:
If Not RS.EOF Then   ' Test whether RS is at End of File (EOF)
    RS.MoveFirst   ' If not, point to the first record in RS
    Do   ' Start the Do loop
        ……
        RS.MoveNext   ' Move to the next record in RS
    Loop Until RS.EOF   ' The quit condition is end of file
End If
After we finish using the recordset RS, we should close it and end our connection:
RS.close
conn.close
Set RS = nothing
Set conn = nothing
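Put together, the statements above could form one complete page. The following is only a sketch: the file name list.asp is hypothetical, and the columns studID and name are assumed from the student table used later in this appendix.

```vbscript
<%@LANGUAGE="VBSCRIPT" CODEPAGE="950"%>
<%
' list.asp (hypothetical file name): print every student on its own line.

' Open the connection to the Access database.
Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Provider = "Microsoft.Jet.OLEDB.4.0"
Conn.Open Server.MapPath("db/project.mdb")

' Run the query and keep the result in the recordset RS.
Set RS = Server.CreateObject("ADODB.Recordset")
sql = "SELECT studID, name FROM student"   ' column names are assumptions
RS.Open sql, Conn

' Walk through the result only when at least one row came back.
If Not RS.EOF Then
    RS.MoveFirst
    Do
        Response.Write RS("studID") & " " & RS("name") & "<br>" & Chr(13)
        RS.MoveNext
    Loop Until RS.EOF
End If

' Release the recordset and the connection.
RS.Close
Conn.Close
Set RS = Nothing
Set Conn = Nothing
%>
```

Saved in the root of the web server, such a page would be requested with a URL like http://127.0.0.1/list.asp.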
The above shows the core asp statements needed to access the database. Now, we go on to study how
the project can be implemented.
Developing the project:
This project requires an online learning platform. My proposed
idea is to let the user log in to the system and then offer a number
of choices, say MC questions, online polling or a forum, etc. So,
first of all, an index.htm has to be created.
The login page is shown on the left. After the button is pressed, it
will pass the information to the server-side application called
“logon.asp”.
Here, we should pay attention to two points:
1) How would information be passed to another web page (web application)?
2) How would the web application distinguish different pieces of information?
<Form name="MyForm" action="logon.asp" Method="Post">   <!-- Send the form data to logon.asp -->
Username: <INPUT TYPE="text" name="username"><br>   <!-- Name the first text box "username" -->
Password: <INPUT TYPE="password" name="passw"><br>   <!-- Name the second text box "passw" -->
<INPUT TYPE="submit" VALUE="Submit Form">   <!-- Note that the type of the button is submit -->
</Form>
*We need to use a <FORM> to submit several pieces of data to a web application.
After the information has been submitted to the web application “logon.asp”, the program can use the following
code to get the data from the form in index.htm.
temp_user = Request.Form("username")   ' Get the value of username from the form in index.htm. In the above example, temp_user will hold the value "03002".
temp_password = Request.Form("passw")
By using the variable temp_user, we can construct the SQL statement to find the password in the
database.
sql = "SELECT password FROM student WHERE studID = '" & temp_user & "'"
With the username from the above example, the resulting SQL statement looks like:
SELECT password FROM student WHERE studID = '03002'
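Building the statement by concatenation means that whatever the user types becomes part of the SQL text, so a username containing a single quote would change the statement itself. One common alternative is to bind the value as a parameter of an ADODB.Command. The following is a sketch of that alternative, not the code used in this project; the parameter name "studID" and the size 50 are assumptions.

```vbscript
<%
' Sketch only: the same lookup, with the username bound as a parameter
' instead of being concatenated into the SQL text.
Const adCmdText = 1        ' ADO constant: CommandText is an SQL statement
Const adVarChar = 200      ' ADO constant: string parameter
Const adParamInput = 1     ' ADO constant: input parameter

temp_user = Request.Form("username")

Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Provider = "Microsoft.Jet.OLEDB.4.0"
Conn.Open Server.MapPath("db/project.mdb")

Set Cmd = Server.CreateObject("ADODB.Command")
Set Cmd.ActiveConnection = Conn
Cmd.CommandType = adCmdText
Cmd.CommandText = "SELECT password FROM student WHERE studID = ?"

' The ? placeholder is filled in by the provider; 50 is an assumed maximum length.
Cmd.Parameters.Append Cmd.CreateParameter("studID", adVarChar, adParamInput, 50, temp_user)

Set RS = Cmd.Execute       ' RS holds the matching row, if any

RS.Close
Conn.Close
Set RS = Nothing
Set Cmd = Nothing
Set Conn = Nothing
%>
```

The recordset returned by Cmd.Execute can be tested with RS.EOF and read with RS.Fields("password") in the same way as one opened with RS.Open.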
The details of the program code are shown below:
<%
temp_user = Request.Form("username")   ' Get the username from the previous form
' Set up the connection
Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Provider = "Microsoft.Jet.OLEDB.4.0"
Conn.Open(Server.MapPath("db/project.mdb"))
Set RS = Server.CreateObject("ADODB.Recordset")
' Define the SQL statement to find the password of that particular studID
sql = "SELECT password, name FROM student WHERE studID = '" & temp_user & "'"
RS.Open sql, Conn   ' Execute the SQL and store the result in the variable RS
' Test whether passw from the previous form equals the password in the recordset RS;
' if not, or if no such studID was found, redirect to a page called wrong_input.htm
If RS.EOF Then
    Response.Redirect("wrong_input.htm")
ElseIf RS.Fields("password") <> Request.Form("passw") Then
    Response.Redirect("wrong_input.htm")
End If
' Close and clear the variables RS and Conn
RS.Close
Conn.Close
Set RS = Nothing
Set Conn = Nothing
%>
If the password of that particular studID is correct, the user stays on the page (no redirect to wrong_input.htm),
and a form is shown as below:
On this page, select MC exercise 001 and press the corresponding GO! button; an online MC
question will appear as shown below:
As you can expect, there are at least two forms, one for the MC and the other for the polling. For
the MC, the HTML code would look like:
<FORM action="domc.asp" method="post">
And we can use the following code to define the combo box for the MC:
<SELECT name="mc">   <!-- Define the selection box -->
<%
' Set up the connection
Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Provider = "Microsoft.Jet.OLEDB.4.0"
Conn.Open(Server.MapPath("db/project.mdb"))
Set RS = Server.CreateObject("ADODB.Recordset")
sql2 = "SELECT DISTINCT exID FROM mc"   ' Define the SQL to find exID
RS.Open sql2, Conn   ' Execute the SQL
If Not RS.EOF Then
    RS.MoveFirst
    Do
        ' Put each exID into the selection box; each option is given a corresponding value
        Response.Write "<option value='" & RS("exID") & "'>" & RS("exID") & "</option>" & Chr(13)
        RS.MoveNext
    Loop Until RS.EOF
End If
%>
</SELECT>
With the code above, the web application domc.asp can get the MC exercise number by using the
statement temp_exID = Request.Form("mc").
However, how does it know which user (studID) is answering? And do we need to check the login password again?
In fact, we can use a mechanism called a session to handle this. It can be set up in logon.asp:
If RS.Fields("password") <> Request.Form("passw") Then
    Response.Redirect("wrong_input.htm")
Else
    ' If the password is correct, define the session variables "authorized" and "user" and assign values to them
    Session("authorized") = "true"
    Session("user") = Request.Form("username")
End If
Then, in domc.asp, we do not have to check the password all over again; a simple
statement like this is enough:
If Session("authorized") <> "true" Then   ' Note that the string comparison is case-sensitive here
    Response.Redirect "index.htm"
End If
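Taken together, the first lines of domc.asp might look like the following sketch. The variable names temp_user and temp_exID follow the earlier examples; the real domc.asp is not reproduced in these notes, so the exact layout is an assumption.

```vbscript
<%
' Start of domc.asp (sketch): send visitors who have not logged in back to the login page.
If Session("authorized") <> "true" Then
    Response.Redirect "index.htm"
End If

' The user comes from the session set in logon.asp;
' the exercise number comes from the combo box named "mc".
temp_user = Session("user")
temp_exID = Request.Form("mc")
%>
```

Because the session is kept by the server, the student ID never has to be sent through the form again.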
Finally, there should be a number of answer boxes, so the following program code is required to
generate the names of the answer boxes.
no_record = 0
If Not RS.EOF Then
    RS.MoveFirst
    Do
        no_record = no_record + 1
        Response.Write "<tr><td>" & RS("queID") & "</td>"
        Response.Write "<td>" & RS("question") & "</td>"
        Response.Write "<td>" & RS("choice1") & "</td>"
        Response.Write "<td>" & RS("choice2") & "</td>"
        Response.Write "<td>" & RS("choice3") & "</td>"
        Response.Write "<td>" & RS("choice4") & "</td>"
        ' Name each answer box after its queID
        Response.Write "<td><input value='A' type='text' name='" & RS("queID") & "'></td></tr>"
        RS.MoveNext
    Loop Until RS.EOF
End If
' Pass a hidden value "exID" to the next page
Response.Write("<input type='hidden' name='exID' value='" & temp_mc & "'>")
Session("no_records") = no_record
To write data into the database, the users must be given write
permission. We can highlight the database file or the
database folder and then select Properties. Then, we can
select Security and add a new account, say, Everyone.
Then, you can give Everyone full control of the folder db
(i.e. including the right to write).
Now, through the group account Everyone, any user is
granted the right to write / modify data in the folder db, i.e. the
database file project.mdb. So, we can now make use of asp
program code to update data in the database.
In the asp program code, we have to set the recordset’s properties so that it can be written to. Here are the
statements required:
RS.CursorType = 2   ' 2 = adOpenDynamic
RS.LockType = 3   ' 3 = adLockOptimistic
And data can be assigned as follows:
RS.Addnew
RS.Fields("exID") = Request.Form("exID")
RS.Fields("studID") = Session("user")
RS.Fields("answer") = Request.Form(CStr(counter))
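CStr is a function that converts a value into a string. It is required here because the answer boxes are named after their
queID, so the counter must be turned into a string to look a box up by name; the field “answer” in the database file
MC_result is set to be text.
Finally, the update process is completed by the following statement: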
RS.update
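The statements above can be combined into one loop that saves every answer. This is only a sketch under stated assumptions: the answer boxes are assumed to be named "1", "2", ... up to Session("no_records"), and MC_result is assumed to be the name of the table that receives the answers.

```vbscript
<%
' Sketch only: store one row per answer box.
Const adOpenDynamic = 2       ' same value as RS.CursorType = 2 above
Const adLockOptimistic = 3    ' same value as RS.LockType = 3 above

Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Provider = "Microsoft.Jet.OLEDB.4.0"
Conn.Open Server.MapPath("db/project.mdb")

' Open the target table with a writable recordset.
Set RS = Server.CreateObject("ADODB.Recordset")
RS.CursorType = adOpenDynamic
RS.LockType = adLockOptimistic
RS.Open "SELECT * FROM MC_result", Conn   ' table name is an assumption

' Assumes the queID values run from 1 to the number of questions.
For counter = 1 To Session("no_records")
    RS.AddNew
    RS.Fields("exID") = Request.Form("exID")
    RS.Fields("studID") = Session("user")
    RS.Fields("answer") = Request.Form(CStr(counter))
    RS.Update
Next

RS.Close
Conn.Close
Set RS = Nothing
Set Conn = Nothing
%>
```

If the queID values are not consecutive numbers, the loop would have to read the names of the answer boxes in some other way.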
If you want to have the web application (the program code and HTML code), you can download it
here: http://www.yll.edu.hk/~yll-cym/ca_web.zip
In fact, to make updating the web application easier, one should set up an FTP server and open the root of the web
server to it. GoldenFTP is freeware for doing so; you can download it here:
http://www.yll.edu.hk/~yll-cym/goldenftp/goldenftp.zip
<End of Appendix 1>