Introduction to Relational Database
David Gerbing
School of Business Administration
Portland State University
Table of Contents
SECTION I  BASIC RELATIONAL DATABASE CONCEPTS
Introduction
    Database Management Systems
    History
Database Structure
    Tables
    Limitations of Representing Disparate Data Types with a Single Table
    One-to-Many Relationships across Multiple Tables
Forms
Database Analysis
    Queries
    Reports
Example: The Invoicing Database
    Invoicing Structure
    OrderLine subform
Database Design and Use
References
    Further Reading
    Endnotes
© David W. Gerbing, 2013
Section I
Basic Relational Database Concepts
This section introduces the basic vocabulary and central concepts
of a relational database, including the tables that store the data and
the one-to-many relationships that link the data across tables. This
section also introduces the data analysis techniques of queries and
reports. These concepts are applied to the description of an
invoicing database.
Relational Database
Introduction
Databases store and organize data. The names, addresses, and phone numbers
in a phone book constitute a database, as does a collection of recipes stored on
a set of index cards.
A database is a stored, structured, integrated and retrievable set of
data.
Today most organizations and businesses store their databases on computer
systems. Computerized databases include accounting applications such as
invoicing, accounts receivable, and accounts payable. Other business
examples include sales management applications organized around lists of
customers and potential customers, inventory control, and the pictures and
descriptions of a company’s products available for sale over the Internet.
Database Management Systems
The concept of a DBMS. An application program and related files for managing
a computerized database form a Database Management System (DBMS). As
Table 1 illustrates, a DBMS allows a user to (a) enter and modify the data, (b)
organize the data, and then (c) retrieve and/or summarize the data.
Table 1. Database activities.
Activity
    Manipulate Data
    Organize Data
    Retrieve and/or Summarize Data
Examples
    change a customer’s order
    enter a new student into the class roster
    delete a student from the class roster
    locate all past due invoices
    sort a list of students by their last name
    locate all students registered for a particular class
    calculate total sales over the last year by zip code
    list students in a class on the computer screen
    create a disk file of students in the class
    print a listing of students in the class
DBMS programs span a wide range of power and usability. A desktop DBMS
runs on a personal computer for only a single user. A server DBMS allows
simultaneous access to one set of data by multiple users at different computers.
The individual users or clients connect to the server computer that stores the
data that all users access. A small office or departmental server DBMS may
allow a dozen users to simultaneously access the database. The most powerful
server DBMS programs run on the largest mainframe computers and allow
hundreds, thousands, or even tens of thousands of users, all simultaneously
modifying, processing, and retrieving data.
Some specific DBMS programs. The most widely installed DBMS at the
personal computer level is Microsoft Access[i], restricted to the Windows family
of operating systems. The most widely used DBMS that runs on both Windows
and Macintosh personal computers is FileMaker Pro. PC Magazine[ii] (2012)
named FileMaker Pro 12 as Editor’s Choice for Best Personal Database. One
reason for this high rating is that FileMaker Pro is simpler to use than MS Access
but still delivers much of the power of the more sophisticated, and
correspondingly more difficult to use, DBMS programs. FileMaker also scales
more easily from one to hundreds of users, whereas Access is limited to only
several users at a time.
Large companies and organizations run their database systems on mainframe
computers. Examples include airline reservation systems, bank records, and
student records for large universities. The most widely installed large-scale
DBMS is Oracle[iii]. The large-scale DBMS with the second highest number of
installations is IBM’s DB2[iv]. Microsoft’s entry into the large-scale database field
is SQL Server[v], which is often integrated with a graphical user interface
provided by MS Access (or a similar product such as 4D). Versions of these
large-scale systems also run on smaller computers.
The Unix world offers several powerful commercial and open-source (free)
DBMS’s. Perhaps the most powerful and full-featured open-source database
that runs on Windows and Unix (and therefore Macintosh) is PostgreSQL[vi].
MySQL is another popular open-source product, particularly for working with
data on web sites such as customer lists and products offered for sale[vii]. MySQL
is more of a file manager than a true relational database, a kind of
database-light. SQLite is a very small, free, open-source database that is widely used in
many applications. For example, SQLite is at the core of many iPhone
applications, providing the structure for storing and organizing data.
History
An IBM researcher, Ted Codd, proposed the concept of a relational database in
1970. Based on Codd’s work, IBM developed the first versions of the standard
relational database programming language called SQL. IBM, however, already
sold a type of commercially successful database that preceded the relational
database, so the company did not innovate for fear of undermining their existing
product. Larry Ellison seized the opportunity in the late 1970s and co-founded
the company that became Oracle Corporation, which offered the first commercial relational database
based on the SQL language. Oracle’s database market preeminence continues
today[viii],[ix].
Appearing in 1985 on the Macintosh, 4D was one of the first database programs
for a personal computer. 4D was also the very first database application to offer
a graphical user interface – when all other mainframe and PC databases were
limited to text interfaces.
Database Structure
Different DBMS’s organize data differently. The most common type of database
structure is the relational database.
A relational database stores all data in a series of related tables.
Some forms of databases (hierarchical and network) predate the relational
database, and a newer form of database structure, the object-oriented database,
is emerging. However, the vast majority of contemporary databases, including
those previously discussed such as Microsoft Access and 4D, follow the
relational model.
Tables
A table organizes data by columns and rows. The table in Figure 1 contains
data for two customers. An actual Customer database table may contain data
for millions of customers.

Entity Type: Customer Table
Fields:   Last     First  Addr1
Records:  Smith    John   22 NE 22nd St
          Jackson  Sally  33 NE 33rd St
Figure 1. Database table with two records (Customers).
All of the data in a table describes a particular type of person, event or object.
The type of person, event or object described by the table is the entity type.
Customer and Employee are types of persons that are common entity types in
the business world. Other common business entity types are Payment, Order
and Product.
Each row of a table, a record, contains the data for a specific instance of the
entity type, such as the data for a particular customer named John Smith as in
Figure 1. Each column of the table, a field, describes a particular property or
attribute of each instance of an entity type. Examples of fields include a
customer’s last name, called Last in Figure 1. Many fields in a database store
alphanumeric characters, such as a person’s last name. Other fields represent
numbers such as invoice subtotal, dates such as the date of issuing an invoice,
or a simple yes or no.
Ideally all data in a table describes characteristics of a single thing or event.
Further, each field in the table should describe a single, specific characteristic of
that thing or event.
A guiding principle of database design is that each field should
describe as small an amount of information as possible.
For a Customer table, instead of a single field called Name, include two fields,
LastName and FirstName. If the situation requires more information about a
person’s name, then include four fields: Salutation, LastName, FirstName, and
MidName. Always store each distinct, small piece of information in a separate
field.
Multiple pieces of information jammed together into a single field complicate
the processing of the distinct pieces of information. If a name field stores name
information as "John Smith", then sorting people by their last names would be
difficult because each last name is embedded within a larger field. Instead of
just directly locating the field for the last name and sorting, the computer would
have to find the generic name field, search within the field for a space, then
identify the last name as the information after the space to the end of the field,
store that extracted information somewhere, and then sort on that extracted
information.
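The difference can be sketched in a few lines of Python. This is only an illustration: the two records reuse the names from Figure 1, the helper last_name is hypothetical, and it assumes the last name is whatever follows the final space, an assumption that fails for many real names.

```python
# Names stored in one combined field.
combined = ["John Smith", "Sally Jackson"]

def last_name(full_name):
    # Assumption: everything after the final space is the last name.
    return full_name.rsplit(" ", 1)[-1]

# The last name must be extracted from each value before sorting.
sorted_combined = sorted(combined, key=last_name)

# Names stored in separate fields: sort directly on the LastName field.
separate = [
    {"FirstName": "John", "LastName": "Smith"},
    {"FirstName": "Sally", "LastName": "Jackson"},
]
sorted_separate = sorted(separate, key=lambda record: record["LastName"])

print(sorted_combined)
print([record["LastName"] for record in sorted_separate])
```

With separate fields no parsing rule is needed at all, which is the point of keeping each field as small as possible.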
A relational database could consist of only a single table, called a flat-file
database, but a single table is typically far from optimal. The next section
explores some of the problems that develop when a single table stores
information for multiple objects or events.
Limitations of Representing Disparate Data Types with a Single Table
To illustrate the need for multiple tables, consider a veterinarian’s database that
records information regarding customers and pets. In addition to storing all the
usual information about a customer – name, address, and phone number, etc. –
the veterinarian also records information about each of the customer’s pets.
How many pets can a customer have? The answer is many; the specific number
varies from customer to customer. One customer may have only a single cat,
another customer may have one dog and one cat, and another customer may
have four birds.
How can the varied information regarding the number and types of pets be
recorded in a flat-file database? How can customer information and pet
information all be jammed together into a single table? There are two
alternatives, both undesirable.
Attempted Solution #1: Repeating groups. One attempted solution for storing
customer and pet information within a single table places the pet information at
the end of each customer record. Figure 2 illustrates this situation. Each pet’s
name, species, and birth date for a single customer are recorded at the end of
that customer’s record.
                               Repeating Group 1       Repeating Group 2
Last     First  Addr1          Nm1     Type1  BrthDt1  Nm2    Type2  BrthDt2
Smith    John   22 NE 22nd St  Spot    Dog    4/18/97
Jackson  Sally  33 NE 33rd St  Fluffy  Cat    9/15/99  Muffy  Cat    9/15/99
Figure 2. Problem: Forcing information from two entities, customers
and pets, into a single table with repeating groups.
Multiple sets of fields that hold the same kind of information (pet name, type, and
birth date) for different objects (pets) on the same record (customer) form a
repeating group. One problem with repeating groups is that different people
require differing numbers of fields to describe their situation. A related problem
is that the number of repeating groups limits the number of pets for a customer.
Providing a large number of pet fields to accommodate the customer with the most
pets does not preclude the possibility that a new customer may have even more pets. At
the same time, a large number of pet fields leaves most pet fields in the database
empty as the majority of customers have only a single pet or two. Another
problem of using repeating groups is that adding an additional set of fields to
describe each pet is unwieldy and difficult to update and maintain. For
example, to add the date of the last vaccination for each pet would require
adding a field for each repeating group on each Customer record.
Attempted Solution #2: Redundant data. A second potential solution for
storing information for multiple pets in the Customer table creates a new record
for each pet. Figure 3 illustrates the situation in which Sally Jackson owns two
cats, so two Sally Jackson records appear in the Customer table. The customer
information for Sally Jackson repeats for each of the two records, an example of
redundant data.

Last     First  Addr1          Nm1     Type1  BrthDt1
Smith    John   22 NE 22nd St  Spot    Dog    4/18/97
Jackson  Sally  33 NE 33rd St  Fluffy  Cat    9/15/99   <- Multiple Rows of
Jackson  Sally  33 NE 33rd St  Muffy   Cat    9/15/99   <- Redundant Data
Figure 3. Problem: Forcing information from two entities, customers and
pets, into a single table with multiple rows with redundant
information.
One problem with redundant data is wasted space. This example shows just
three fields for a customer, but there may be tens of such fields in an actual
database. All of this information must be copied into every additional record. Of
greater concern is the difficulty of updating repeatedly copied information. If a
customer’s address changes, all records with that address must also change. If
only some of these records are modified, the same customer will have multiple
addresses in the database.
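The update problem can be reproduced with a minimal sketch that uses Python's built-in sqlite3 module. The table name CustomerPet and the new address are assumptions made for illustration; the rows follow Figure 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One flat table that mixes customer fields and pet fields, as in Figure 3.
conn.execute(
    "CREATE TABLE CustomerPet (Last TEXT, First TEXT, Addr1 TEXT, Nm1 TEXT, Type1 TEXT)"
)
conn.executemany(
    "INSERT INTO CustomerPet VALUES (?, ?, ?, ?, ?)",
    [
        ("Smith", "John", "22 NE 22nd St", "Spot", "Dog"),
        ("Jackson", "Sally", "33 NE 33rd St", "Fluffy", "Cat"),
        ("Jackson", "Sally", "33 NE 33rd St", "Muffy", "Cat"),
    ],
)

# An incomplete update: only the row for Fluffy receives the (hypothetical) new address.
conn.execute(
    "UPDATE CustomerPet SET Addr1 = ? WHERE Last = ? AND Nm1 = ?",
    ("44 NE 44th St", "Jackson", "Fluffy"),
)

# The same customer now appears with two different addresses.
addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT Addr1 FROM CustomerPet WHERE Last = 'Jackson' ORDER BY Addr1"
    )
]
print(addresses)
```

Nothing in the flat table prevents this inconsistency, because the address is stored once per pet rather than once per customer.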
One-to-Many Relationships across Multiple Tables
Normalization. The solution to the related problems of repeating groups and
redundant data is data normalization. The formal definition of normalization
involves several rules[x]. The following idea captures much (not all) of the
meaning of these rules.
Normalization avoids redundant data by placing data that
describes a specific entity type only in the table for that entity type.
The design goal of normalization for constructing a relational database usually
results in data placed in multiple tables. Simply put, customer data goes into
the customer table and pet data goes into the pet table. In place of one table
vainly attempting to blend disparate types of information, reduce the different
types of information down into their basic types – customers and pets in this
example – so that each type of information resides in its own table.
Key fields. A relational database typically stores many distinct types of
information. Only one kind of information is stored in a table, so a database
typically contains many tables.
Most relational databases consist of several or even tens of related
tables.
What is needed is a way to relate entries in one table to the corresponding
entries in a related table. Referring to Figures 2 and 3, Sally Jackson has two
cats, so if pets are placed into their own table, then the data
organization must somehow indicate which two cats Sally Jackson owns.
Information across multiple tables can be related if there is a unique identifier
for every record. The unique identifier for each record, a single
field or a combination of fields, is called the primary key. The safest and one of
the easiest ways to define a primary key is to create a single field solely for
that purpose. For example, the DBMS can generate a
customer number that uniquely identifies each customer. For a system that can
store up to 10,000 customers, a customer number might be of the form C0002.
The name of the field could be CustomerID or CustID. Similarly, a Pet ID such
as P0008 could be stored in a field called PetID.
To identify customer ownership of pets, simply add a field to the Pet table that
duplicates the customer’s primary key. The only duplicated information in the
properly normalized Customer and Pet tables is the customer’s primary key. As
before, each Pet record has a primary key such as P0008 that uniquely identifies
a specific pet. The new idea presented here is that each Pet record also
contains a copy of a primary key from the Customer table that uniquely
identifies the pet’s owner. This key field is the link to the customer information
that is foreign to the information stored in the Pet table. The copy of a primary
key that links a record to a record in another table is called a foreign key.
The foreign key in a relational database is the glue that binds
related information together from different tables.
The foreign key field for the Pet table in Figure 4 is named ptrCustID. The "ptr"
in the name is a useful convention that abbreviates "pointer". The field
ptrCustID points to the primary key field CustID in the Customer table.
Customer (One Table)
Primary Key
CustID       Last     First  Addr1
C0001        Smith    John   22 NE 22nd St
C0002        Jackson  Sally  33 NE 33rd St

Pet (Many Table)
Primary Key  Foreign Key
PetID        ptrCustID    Species  Name
P0001        C0001        Dog      Fido
P0002        C0002        Cat      Muffy
P0003        C0002        Cat      Buffy
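As a rough sketch, the two tables of Figure 4 could be declared in SQL as follows, here run through Python's built-in sqlite3 module. The choice of SQLite and of the TEXT column type are assumptions made for illustration; the table names, field names, and values come from Figure 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (
    CustID TEXT PRIMARY KEY,   -- unique identifier for each customer
    Last   TEXT,
    First  TEXT,
    Addr1  TEXT
);
CREATE TABLE Pet (
    PetID     TEXT PRIMARY KEY,                  -- unique identifier for each pet
    ptrCustID TEXT REFERENCES Customer(CustID),  -- foreign key: copy of the owner's CustID
    Species   TEXT,
    Name      TEXT
);
""")
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?, ?)",
    [
        ("C0001", "Smith", "John", "22 NE 22nd St"),
        ("C0002", "Jackson", "Sally", "33 NE 33rd St"),
    ],
)
conn.executemany(
    "INSERT INTO Pet VALUES (?, ?, ?, ?)",
    [
        ("P0001", "C0001", "Dog", "Fido"),
        ("P0002", "C0002", "Cat", "Muffy"),
        ("P0003", "C0002", "Cat", "Buffy"),
    ],
)

pet_count = conn.execute("SELECT COUNT(*) FROM Pet").fetchone()[0]
owner_keys = [row[0] for row in conn.execute("SELECT ptrCustID FROM Pet ORDER BY PetID")]
print(pet_count, owner_keys)
```

The customer's address is stored once, in the Customer table; the only customer information repeated in the Pet table is the key value.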
One-tables and many-tables. The need for multiple tables stems, in practice,
from the fact that one customer could have one, several, or a whole lot of pets.
Accordingly, several different pet records in the Pet table could all have the
same value of the foreign key. In Figure 4, John owns one pet, a Dog, identified
by primary key value P0001. Sally owns two cats, identified, respectively, by
primary keys P0002 and P0003. Both of Sally’s cats have the same foreign key
field, C0002, so both pets belong to the same customer, Sally. Although each
customer can own many pets, each pet is owned by only one customer.
In this example the Customer table is called the one-table and the Pet table is
called the many-table. The one-table contains the primary key field and the
related many-table contains the corresponding foreign key field. One record in
the one-table could potentially relate to many records in the many-table, but
each record in the many-table points to exactly one record in the one-table.
To relate a one-table to a many-table, match each set of one or
more identical values of the foreign key in the many-table with the
corresponding single value of the primary key in the one-table.
For example, in Figure 4, the foreign key value C0002 occurs twice. The
two corresponding Pet records relate to the single record in the Customer table
with the same primary key value, Sally’s record. A one-table and the related
many-table together define a one-to-many relationship.
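In SQL this matching of foreign key values to primary key values is expressed as a join. The following is a minimal sketch with Python's built-in sqlite3 module, again assuming SQLite and TEXT columns, with the data of Figure 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID TEXT PRIMARY KEY, Last TEXT, First TEXT);
CREATE TABLE Pet (PetID TEXT PRIMARY KEY, ptrCustID TEXT, Species TEXT, Name TEXT);
INSERT INTO Customer VALUES ('C0001', 'Smith', 'John'), ('C0002', 'Jackson', 'Sally');
INSERT INTO Pet VALUES
    ('P0001', 'C0001', 'Dog', 'Fido'),
    ('P0002', 'C0002', 'Cat', 'Muffy'),
    ('P0003', 'C0002', 'Cat', 'Buffy');
""")

# Match each foreign key value in Pet with the single primary key value in Customer.
rows = conn.execute("""
    SELECT Customer.First, Pet.Name
    FROM Customer
    JOIN Pet ON Pet.ptrCustID = Customer.CustID
    ORDER BY Pet.PetID
""").fetchall()

for first, pet_name in rows:
    print(first, "owns", pet_name)
```

Sally appears twice in the result because two Pet records carry her key value, C0002.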
This type of relationship, illustrated here by customers and pets, lies at the heart of
the relational database. The tables and their defining fields, plus the one-to-many
relationships among the tables, define the database structure.
The ability to translate aspects of real life into the language of
one-to-many relationships is a primary skill in the construction of a
relational database.
The concept of a one-to-many relationship is pervasive and fundamental. Table
2 lists more examples.
Table 2. Examples of One-to-Many Relationships
One ___ can have Many ___                   Each instance of Many points to only One
One customer owns many animals              Each animal has only one owner
One animal gets many vaccinations           Each vaccination is given to only one animal
One invoice includes many invoice lines     Each invoice line appears on only one invoice
One Product appears on many invoice lines   Each invoice line has only one Product
One musician produces many albums           Each album is produced by only a single musician
One company hires many employees            Each employee works only for a single company
One zip code contains many customers        Each customer has only a single zip code
One customer makes many orders              Each order goes only to a single customer
One employee invests in many IRAs           Each IRA is owned by only a single employee
Entity relationship diagram. A one-to-many relationship can be illustrated with a
diagram.
An entity relationship diagram (ERD) represents each table with a
rectangle and each relationship with a line connecting two rectangles,
with each end of the line marked with a | for one or a < or > for many.
For example, represent the relation
One Customer can have many Pets and each Pet belongs to a single
Customer
with the ERD:
Customers |------< Pets
Some ERD’s are drawn with the convention that the vertical bar is replaced with
a 1 and the < or > is replaced with an M for many. The ERD provides
convenient shorthand for sketching the entire database structure.
The one-to-many relationship pertains to the two tables only with respect to each
other. In one relationship a single table may be a one-table and in another
relationship the same table may be a many-table. For example, one customer
may own many pets and each pet is owned by one customer, but one pet may
have many vaccinations and each vaccination is given to only one pet. In this
example the Pet table is a many-table with respect to customers and a one-table
with respect to vaccinations.
Automatic lookup. A relational DBMS organizes the data according to the
specified one-to-many relationships. A one-to-many relationship establishes an
automatic lookup, which automatically links the information in the related
tables throughout the operation of the database.
The user can automatically locate, select, and display values in one
related table while working in the other related table.
To the person who enters and modifies data in the database, the user,
information stored in multiple related tables appears on-screen as a single unit.
Information in the related tables is automatically made available to the database
user.
For example, the receptionist in the veterinarian’s office has a screen display for
entering a customer’s name. Enter the customer’s name and a properly
designed DBMS automatically displays not only all the customer’s information,
but also all the relevant records from the related many-table, a list of the
customer’s pets. Moving to the list of pets on the screen display, the
receptionist can click a button that indicates the desire to enter the information
for a new pet. Behind the scenes, the DBMS first creates a new record in the
related many-table, the Pet table, and then automatically creates a value for the
Pet primary key and copies the Customer primary key field into the Pet foreign
key field. Spared any needed knowledge of these technical details, the
receptionist sees a new, blank line on the screen with spaces for entering the
needed information such as the pet’s name, species, and birthdate.
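The behind-the-scenes step might look something like the following sketch, written with Python's built-in sqlite3 module. The helper add_pet, the way it generates the next PetID, and the new pet Rex are all hypothetical; a real DBMS handles key generation internally and more robustly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID TEXT PRIMARY KEY, Last TEXT, First TEXT);
CREATE TABLE Pet (PetID TEXT PRIMARY KEY, ptrCustID TEXT, Species TEXT, Name TEXT);
INSERT INTO Customer VALUES ('C0001', 'Smith', 'John'), ('C0002', 'Jackson', 'Sally');
INSERT INTO Pet VALUES
    ('P0001', 'C0001', 'Dog', 'Fido'),
    ('P0002', 'C0002', 'Cat', 'Muffy'),
    ('P0003', 'C0002', 'Cat', 'Buffy');
""")

def add_pet(conn, cust_id, name, species):
    """Create a Pet record linked to an existing customer (hypothetical helper)."""
    # Simplistic key generation: count the existing records and add one.
    # This would produce duplicate keys if pets were ever deleted.
    (count,) = conn.execute("SELECT COUNT(*) FROM Pet").fetchone()
    pet_id = f"P{count + 1:04d}"
    # Copy the customer's primary key into the new pet's foreign key field.
    conn.execute(
        "INSERT INTO Pet (PetID, ptrCustID, Species, Name) VALUES (?, ?, ?, ?)",
        (pet_id, cust_id, species, name),
    )
    return pet_id

# The receptionist is viewing John's record (C0001) and adds a new pet.
new_id = add_pet(conn, "C0001", "Rex", "Dog")
johns_pets = [
    row[0]
    for row in conn.execute(
        "SELECT Name FROM Pet WHERE ptrCustID = ? ORDER BY PetID", ("C0001",)
    )
]
print(new_id, johns_pets)
```

The person entering the data supplies only the pet's name and species; both key values are filled in by the program.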
Referential integrity. The DBMS also maintains a set of integrity
constraints, called referential integrity, to protect the validity of the primary and
foreign keys. The three basic rules of referential integrity follow.
1. For a many-table, each value of the foreign key must match a primary key
value in the corresponding one-table.
Ex. The Customer ID for Fido in the pet table matches John’s Customer ID in the customer table.
2. If a primary key field is changed, then all foreign key fields in all related
many-tables should change accordingly.
Ex. If John’s Customer ID changes, then the corresponding values of this ID for all of John’s pets
in the pet table must also change.
3. If related records exist in the many-table, then the corresponding record in the one-table cannot be deleted first.
Ex. If John has some pets, John’s customer information cannot be deleted first.
Usually the DBMS can be set to apply these rules automatically in the
background. For example, when a Customer’s ID is changed,
the usually automatic application of referential integrity changes all related
values of the corresponding foreign keys.
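The three rules can be tried out with a short sketch in Python's built-in sqlite3 module. It assumes SQLite, which enforces foreign keys only after PRAGMA foreign_keys = ON; the clauses ON UPDATE CASCADE and ON DELETE RESTRICT are one possible way of expressing rules 2 and 3, and the key values C9999 and C0100 are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (CustID TEXT PRIMARY KEY, Last TEXT, First TEXT);
CREATE TABLE Pet (
    PetID     TEXT PRIMARY KEY,
    ptrCustID TEXT NOT NULL
        REFERENCES Customer(CustID) ON UPDATE CASCADE ON DELETE RESTRICT,
    Species   TEXT,
    Name      TEXT
);
INSERT INTO Customer VALUES ('C0001', 'Smith', 'John'), ('C0002', 'Jackson', 'Sally');
INSERT INTO Pet VALUES
    ('P0001', 'C0001', 'Dog', 'Fido'),
    ('P0002', 'C0002', 'Cat', 'Muffy'),
    ('P0003', 'C0002', 'Cat', 'Buffy');
""")

# Rule 1: a foreign key value must match an existing primary key value.
try:
    conn.execute("INSERT INTO Pet VALUES ('P0009', 'C9999', 'Dog', 'Rex')")
    rule1_enforced = False
except sqlite3.IntegrityError:
    rule1_enforced = True

# Rule 3: a customer who still has pets cannot be deleted first.
try:
    conn.execute("DELETE FROM Customer WHERE CustID = 'C0002'")
    rule3_enforced = False
except sqlite3.IntegrityError:
    rule3_enforced = True

# Rule 2: changing a primary key value updates the matching foreign key values.
conn.execute("UPDATE Customer SET CustID = 'C0100' WHERE CustID = 'C0001'")
fido_owner = conn.execute(
    "SELECT ptrCustID FROM Pet WHERE PetID = 'P0001'"
).fetchone()[0]

print(rule1_enforced, rule3_enforced, fido_owner)
```

Other DBMS programs expose the same rules through their own settings or dialogs rather than through these exact clauses.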
Given a database structure, data can be entered. Next we examine the means
to enter and display the data on-screen in a database.
Forms
Tables store the data. Forms provide the format for displaying the data from one
or more tables. A form may format the display of data on-screen for display
only or also for entering and modifying data. A form may also format the
display of data on the printed page.
Each form is attached to a specific table. A single table, however, may have a
dozen or more forms. Each form provides a specific view of all or some of the
data for a single record or a list of records in the table. A form can also display
data from related tables.
Each table in a database generally has at least two forms. A table’s list or output
form lists one record per line. When viewed through a list form, many records
can be displayed on the screen at the same time or can be printed onto a single
page.
Figure 4.
Information from two records displayed with a list form.
A table’s detail or input form displays one record at a time, usually filling much
of or all of the screen or the printed page with information from a single record.
Most list forms are set such that double-clicking on a line of information on the
screen opens up the corresponding detail form for displaying that record.
Double-clicking on the first record displayed with a list form on the screen in
Figure 4 reveals the screen display in Figure 5 of the corresponding detail form.
Figure 5. A detail form.
Different forms for a single table are typically used in different contexts to enter,
modify, or print data from that table. One Invoice form, for example, displays
the invoice on screen and another Invoice form formats the printing of the
invoice. Or, one detail form may display, to a user with sufficient privileges and
password, all customer information, including confidential information such as
a credit card number. Another detail form may exist, available to all users of
the database, which only displays part of the customer information, avoiding the
confidential data.
Database Analysis
A primary purpose of a database is to selectively retrieve and analyze the stored
data. Some analyses require extra programming, such as generating invoices.
Two general forms of analysis, queries and reports, are available to all
databases.
Queries
Searching and sorting information is a fundamental purpose of a database.
Which customers live in the 97201 zip code? Which orders were entered on
2/21, listed by size of the order? How many widgets were sold during the last
month? Which accounts are overdue, listed by the amount overdue? A
database query answers these kinds of questions.
A database query retrieves records from the database according to
specified criteria.
For example, the records displayed by the list form in Figure 4 (p. 12) display
the results of a query for customers with the last name of Gerbing. Indeed,
obtaining the type of information provided by a database query is one of the
primary purposes for maintaining a database.
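As an illustration, the first question above (which customers live in the 97201 zip code?) might be written as an SQL query, shown here through Python's built-in sqlite3 module. The Zip field and the three sample customers are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (CustID TEXT PRIMARY KEY, Last TEXT, First TEXT, Zip TEXT);
INSERT INTO Customer VALUES
    ('C0001', 'Smith', 'John', '97201'),
    ('C0002', 'Jackson', 'Sally', '97219'),
    ('C0003', 'Rumbinsky', 'Ann', '97201');
""")

# The criterion is Zip = 97201; the result is sorted by last name.
in_zip = conn.execute(
    "SELECT Last, First FROM Customer WHERE Zip = ? ORDER BY Last",
    ("97201",),
).fetchall()
print(in_zip)
```

The WHERE clause states the selection criteria and the ORDER BY clause states the sort order, the two parts of the questions listed above.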
An index records where the information for each value of a field is located on the
hard disk. Indexing a field greatly speeds up searching for information in that
field, such as a LastName of Rumbinsky. For example, if LastName is
an indexed field, an ordinary PC can search through tens of thousands of
Customer records and retrieve all customers with a last name of Rumbinsky in
less than a second. If the field is not indexed, this query could take much longer.
Without the index the DBMS must fall back on a less efficient sequential
search. A sequential search begins with the first customer record, then looks at
the next customer record, and so on, until a record with the LastName of
Rumbinsky is located. If the LastName field is indexed, then, just like a person
looking up the page number of a concept in the index of a book, the DBMS can
directly identify the physical location on the disk that contains the Rumbinsky
customer information.
Indexing greatly speeds up a query, such as querying on a value of LastName,
so why not index all database fields? The answer is that indexing is not without
cost. The index must be created and perpetually updated as the database is
modified. In addition, the index requires space for storing the location of all the
indexed records. In practice, only those fields searched on a regular basis are
indexed. Indexed fields include all primary and foreign key fields, and other
relevant fields such as LastName and Zip. Fields such as FirstName or street
Address are rarely indexed.
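One way to see the effect of an index is to ask the DBMS how it plans to run a query. The sketch below assumes SQLite (through Python's built-in sqlite3 module) and its EXPLAIN QUERY PLAN statement; the index name idx_customer_lastname and the generated sample customers are made up for illustration, and the exact wording of the plan differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (CustID TEXT PRIMARY KEY, LastName TEXT, FirstName TEXT)"
)
# Hypothetical sample data: many generated customers plus one Rumbinsky.
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?, ?)",
    [(f"C{i:05d}", f"Name{i}", "Pat") for i in range(20000)],
)
conn.execute("INSERT INTO Customer VALUES ('C99999', 'Rumbinsky', 'Lee')")

def query_plan(conn):
    """Return SQLite's description of how it would run the LastName search."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM Customer WHERE LastName = 'Rumbinsky'"
    ).fetchall()
    # The last column of each row holds the human-readable description.
    return " ".join(row[-1] for row in rows)

plan_before = query_plan(conn)  # sequential scan of the whole table
conn.execute("CREATE INDEX idx_customer_lastname ON Customer (LastName)")
plan_after = query_plan(conn)   # search that uses the new index

print(plan_before)
print(plan_after)
```

Before the index exists the plan describes a scan of the Customer table; afterwards it names the index, which is the lookup described above.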
Reports
A primary reason for storing data in a database is for subsequent processing and
analysis. Sometimes this information can be directly obtained, such as running
a query that identifies and lists all Customers named Smith. The report
capability of a database takes this display one step further, offering
analysis beyond that provided by a query.
The database report is a readable, formatted display of data and
analysis optimized for printing.
The report editor of a DBMS allows for page numbering, headers and footers,
sorting, calculations, and other features that facilitate the printed display
of information.
The query selects the database records of interest. The report specifies the
analysis to perform on the selected data as well as the format for printing the
results. From a selection of all employees, a report could list all the company’s
departments, and then list each employee and his or her salary within each
department. At the bottom of the listing for each department, the report could
list the average salary and the total salary of each department. The end of the
report could display the average and total salaries for the entire company.
The accompanying report analyzes invoices. This report lists all invoices by zip
code for three zip codes: 18911, 18913, and 18914. The zip codes are sorted
in ascending order. Customers are sorted by last name within each zip code.
The report “breaks” at the listing of each new zip code. A break area on a
report separates levels of different groups, and displays calculations such as
totals and averages. The report displays the sum of all the subtotals for all of the
invoices within each zip code. One break area corresponds to the first sorted
field, a second break area corresponds to the second sorted field, and so on.
The number of break areas specifies the number of levels. The Grand total is at
the top level, the break area at the end of the report. The Grand total displays
calculations based on all of the data in the report. The remaining break areas
are set according to the specified sort order. The next break area, one level
down after the Grand total break area, corresponds to the Zip field. Further,
the report calculates the sum of all the invoice subtotals for each zip code as
well as the grand subtotal, the total sum of all subtotals for all invoices
processed in the report. The total subtotal for zip code 18911 is $211, for zip
code 18913 the total subtotal
is $258, and the largest subtotal of $1465 is for zip code 18914. The sum of all
the subtotals calculated for all of the invoices in the report is $1934. All
numeric values were rounded to the nearest dollar for this report.
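The break logic of such a report can be sketched in a few lines of Python. The
invoice records below are hypothetical and are not the data behind the report
described above; the function name build_report is likewise an assumption made
for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical invoices: (Zip, LastName, invoice subtotal in dollars).
invoices = [
    ("18913", "Baker", 120),
    ("18911", "Adams", 75),
    ("18914", "Chen", 300),
    ("18911", "Young", 40),
    ("18913", "Allen", 60),
]


def build_report(records):
    """Return report lines with a break area per zip code and a grand total."""
    lines = []
    grand_total = 0
    # Sort by the first break field (Zip), then by LastName within each zip.
    ordered = sorted(records, key=itemgetter(0, 1))
    for zip_code, group in groupby(ordered, key=itemgetter(0)):
        zip_total = 0
        lines.append(f"Zip {zip_code}")
        for _, last_name, subtotal in group:
            lines.append(f"  {last_name:<10} {subtotal:>6}")
            zip_total += subtotal
        # Break area for this zip code: the sum of its invoice subtotals.
        lines.append(f"  Total for {zip_code}: {zip_total}")
        grand_total += zip_total
    # Top-level break area at the end of the report.
    lines.append(f"Grand total: {grand_total}")
    return lines


report_lines = build_report(invoices)
print("\n".join(report_lines))
```

Sorting before grouping matters here: a break is detected only when the value
of the sorted field changes from one record to the next.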
Example: The Invoicing Database
A business sells its products – things, information, or services – to its customers.
The order is taken with an on-screen order form and then the customer is billed
with a printed invoice. Both the on-screen order form and the corresponding
invoice consist of customer information, an invoice number and date, an order
line for each product ordered, including quantity and price, and the total
amount owed, including taxes and shipping. A DBMS generates and prints
invoices.
Invoicing Structure
An invoicing database consists of at least the four basic tables shown below
and the following three one-to-many relationships:
1. Each Customer can place many Orders, and each Order is for a single
customer.
2. Each Order can contain many OrderLines, and each OrderLine appears on a
single Order.
3. Each Product can appear on many OrderLines over all existing orders, and
each OrderLine contains only a single Product.
The database structure revealed by the ERD for this database follows. The
structure specifies four tables and three one-to-many relationships. Note that
the OrderLine table contains all the order lines for all orders entered into the
database.
[Figure: ERD with four tables: Customers, Orders, OrderLines, Parts]
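As a rough sketch, the four tables and three one-to-many relationships could be
declared as follows using Python's built-in sqlite3 module. The field names
ptrCustID, ptrProductID, and Qty and the customer ID C34795 appear in this
text; every other column name (OrderDate, OrderLineID, ptrOrderID, and so on)
and all row values are assumptions made for the example. The fourth table is
named Products here, following the running text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID TEXT PRIMARY KEY,
    LastName   TEXT,
    Zip        TEXT
);
CREATE TABLE Products (
    ProductID   TEXT PRIMARY KEY,
    Description TEXT,
    Price       REAL
);
-- Each Order points to exactly one Customer (one-to-many).
CREATE TABLE Orders (
    OrderID   INTEGER PRIMARY KEY,
    OrderDate TEXT,
    ptrCustID TEXT NOT NULL REFERENCES Customers (CustomerID)
);
-- Each OrderLine points to exactly one Order and exactly one Product.
CREATE TABLE OrderLines (
    OrderLineID  INTEGER PRIMARY KEY,
    ptrOrderID   INTEGER NOT NULL REFERENCES Orders (OrderID),
    ptrProductID TEXT NOT NULL REFERENCES Products (ProductID),
    Qty          INTEGER NOT NULL
);
""")

# One customer, many orders (hypothetical rows).
conn.execute("INSERT INTO Customers VALUES ('C34795', 'Rumbinsky', '18914')")
conn.execute("INSERT INTO Orders (OrderDate, ptrCustID) VALUES ('2013-01-15', 'C34795')")
conn.execute("INSERT INTO Orders (OrderDate, ptrCustID) VALUES ('2013-02-03', 'C34795')")

# An order for a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO Orders (OrderDate, ptrCustID) VALUES ('2013-02-04', 'C00000')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute(
    "SELECT COUNT(*) FROM Orders WHERE ptrCustID = 'C34795'"
).fetchone()[0]
print(order_count, orphan_rejected)
```

The foreign key sits in the many-table in each relationship: Orders carries
ptrCustID, and OrderLines carries both ptrOrderID and ptrProductID.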
Data from related one-tables and many-tables can simultaneously appear on the
same form, on-screen or printed. For example, information about an Invoice
(one-table) and the related lines of information for each product ordered
(related many-table) could be viewed on the same form. The information from a
one-table on the form defines a detail form called the main form. The
corresponding information from the related many-table defines a list form
called the subform.
A relational database system represents an invoice as a main form that contains
an OrderLine subform.
An invoice is not stored in the database, but is instead a form
constructed from the data stored in at least four different database
tables.
There is no entity called an invoice that is stored as an invoice in the database.
The DBMS constructs each invoice from the information in these different
tables. The invoice per se exists only as a screen display or a print out.
The main form of the invoice is based on the Order table. This form consists of
information from the Order table, the date and invoice number (OrderID), as
well as information from the corresponding Customer table, such as name and
address. The lines of information, one line for each product ordered, define the
subform on the invoice. Each line of the subform is an OrderLine, which
includes information from the Products table.
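The following sketch, again using Python's built-in sqlite3 module, shows one
way an invoice might be assembled on demand from the four tables. The column
names other than ptrCustID, ptrProductID, and Qty, the function name
build_invoice, and all sample rows are assumptions made for the example. The
subtotal computed here leaves out taxes and shipping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, Name TEXT, Address TEXT);
CREATE TABLE Products (ProductID TEXT PRIMARY KEY, Description TEXT, Price REAL);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, OrderDate TEXT, ptrCustID TEXT);
CREATE TABLE OrderLines (ptrOrderID INTEGER, ptrProductID TEXT, Qty INTEGER);
-- Hypothetical sample data
INSERT INTO Customers VALUES ('C34795', 'Sample Customer', '1 Example St');
INSERT INTO Products VALUES ('P1', 'Widget', 2.50), ('P2', 'Gadget', 10.00);
INSERT INTO Orders VALUES (1001, '2013-01-15', 'C34795');
INSERT INTO OrderLines VALUES (1001, 'P1', 4), (1001, 'P2', 1);
""")


def build_invoice(order_id):
    """Assemble an invoice from the four tables; no invoice row is stored."""
    # Main form: the Order plus the related Customer (one-table).
    header = conn.execute(
        "SELECT o.OrderID, o.OrderDate, c.Name, c.Address "
        "FROM Orders AS o JOIN Customers AS c ON c.CustomerID = o.ptrCustID "
        "WHERE o.OrderID = ?",
        (order_id,),
    ).fetchone()
    # Subform: one row per OrderLine, each joined to its Product.
    lines = conn.execute(
        "SELECT l.ptrProductID, p.Description, p.Price, l.Qty, p.Price * l.Qty "
        "FROM OrderLines AS l JOIN Products AS p ON p.ProductID = l.ptrProductID "
        "WHERE l.ptrOrderID = ? ORDER BY l.ptrProductID",
        (order_id,),
    ).fetchall()
    return {
        "main_form": header,
        "subform": lines,
        "subtotal": sum(line[4] for line in lines),
    }


invoice = build_invoice(1001)
print(invoice)
```

The returned dictionary exists only in memory, much as the invoice in the text
exists only as a screen display or a printout.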
The retrieval of information in related tables is a key operating feature of a
relational database. Given a record in a many-table, the DBMS can search for
and retrieve the related information in the corresponding one-table.
When using an on-screen form, entering information into a foreign
key field in the many-table results in the automatic lookup of the
related data in the corresponding one-table.
For example, entering a Customer ID into the ptrCustID foreign key field on the
Order form automatically retrieves and then displays the related customer
information. Entering C34795 in the ptrCustID field on the on-screen form
leads to the display of the related customer’s name and address, as shown in the
accompanying figure.
OrderLine subform
A second example of automatic lookup in an invoicing database occurs when
entering information into the OrderLine table. The OrderLine displayed on the
Invoice contains the ProductID, the Description, Price, Quantity, and Total for
that line. However, not all of the information displayed on an order line is from
the OrderLine table. The Product table stores Description and Price, as
illustrated in the following figure. After a ProductID is entered into the
ptrProductID field on an OrderLine, the DBMS automatically finds and displays
the Product's Description and Price.
The fields on this OrderLine subform illustrate three distinct sources of
information for the display of information from the fields of a database:
1. The user directly enters the values for ptrProductID and Qty from the
keyboard.
2. The DBMS automatically looks up (retrieves) the values for the
Description and Price fields from the corresponding Products (one)
table.
3. The DBMS calculates the value for the LineTotal field from the existing
values of other fields, Qty and Price.
The information in Steps 2 and 3 automatically appears after the user enters the
information in Step 1.
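The three sources can be mimicked with a short Python sketch. A dictionary
stands in for the Products one-table here, which is a simplification and not
how a DBMS stores data; the product rows and the function name
complete_order_line are assumptions made for the example.

```python
from dataclasses import dataclass

# Stand-in for the Products (one) table, keyed by ProductID. Hypothetical data.
PRODUCTS = {
    "P1": {"Description": "Widget", "Price": 2.50},
    "P2": {"Description": "Gadget", "Price": 10.00},
}


@dataclass
class OrderLine:
    ptrProductID: str   # 1. entered by the user
    Qty: int            # 1. entered by the user
    Description: str    # 2. looked up from Products
    Price: float        # 2. looked up from Products
    LineTotal: float    # 3. calculated from Qty and Price


def complete_order_line(product_id: str, qty: int) -> OrderLine:
    """Fill in the looked-up and calculated fields for one order line."""
    product = PRODUCTS[product_id]  # raises KeyError for an unknown ProductID
    return OrderLine(
        ptrProductID=product_id,
        Qty=qty,
        Description=product["Description"],
        Price=product["Price"],
        LineTotal=qty * product["Price"],
    )


line = complete_order_line("P1", 4)
print(line)
```

Only the two arguments come from the keyboard. Description and Price are never
typed on the order line, so they cannot drift out of step with the Products
table.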
Database Design and Use
Overview of database design. Constructing a DBMS follows a series of related
steps.
a. Define the database structure – tables, fields, and relationships.
b. Construct input and output forms to enter and display data.
c. Add code to perform specific operations, such as calculating the
LineTotal of an Invoice line by multiplying the Price of the product by
the Quantity ordered.
d. Over a period of days, months, or years – modify, revise, and expand the
structure, forms, and code to fix errors and increase the power of the
database.
Follow these steps to develop virtually any database.
Designers vs. users. After the initial implementation in Steps a-c above, the
revisions in Step d require switching back and forth between designing and
using the database.
The designers design the database structure and the users process
the data entered and stored according to that structure.
Every DBMS provides both a means for designing the database structure – the
tables, forms, and code – and a means for actually using the database in terms
of entering, processing and retrieving data.
Database objects. The primary database objects and activities include Tables,
Forms, Queries and Reports. Tables and forms provide the means for the
database user to enter and modify data. Queries and reports are essential
database user activities for gathering and analyzing information.
Object: Tables
  Context: Data
  Action on Object: Data Definition: Structure and store data.
  Example: Store data in a Customer table.
Object: Forms
  Context: Data
  Action on Object: Data Entry and Display: The window for entering data into
  the database and for displaying data.
  Example: Enter new customer data into the database or change existing data
  for existing customers.
Object: Queries
  Context: Analysis
  Action on Object: Locate specific records that meet a criterion.
  Example: Find all customers with a specific zip code.
Object: Reports
  Context: Analysis
  Action on Object: Generate written reports that summarize relevant
  information.
  Example: Print a report that lists the names and addresses of all customers
  grouped by zip code.
A general note about working within any programming environment.
Close windows instead of minimizing. When developing an application such as a
database, many different windows must be opened at various times. When
finished working with a window, it is generally safer to close rather than
minimize the window.
Caution → Minimizing a window leaves the window open, which may
conflict with the information stored in other windows.
Closing each window when finished modifying the contents of that window
prevents these unwelcome interactions.
IV. Addendums
References
Further Reading
Relational Database Design Clearly Explained
by Jan L. Harrington (2nd ed)
Holt Rinehart & Winston (1997); ISBN: 0030315883
Database Design for Mere Mortals:
A Hands-On Guide to Relational Database Design
by Michael J. Hernandez
Addison-Wesley Pub Co (1997); ISBN: 0201694719
Data Modeling Essentials: Analysis, Design, and Innovation
by Graeme C. Simsion
The Coriolis Group (1996); ISBN: 1850328773
Database Modeling & Design: The Fundamental Principles
by Toby J. Teorey
Morgan Kaufmann Publishers (1994); ISBN: 1558602941
Endnotes
i. www.microsoft.com/office/access
ii. http://www.pcmag.com/article2/0,1759,1818968,00.asp
iii. www.oracle.com
iv. www.ibm.com/db2
v. www.microsoft.com/sql
vi. www.postgresql.org
vii. www.mysql.org
viii. wwwinfo.cern.ch/db/aboutdbs/history
ix. www.nap.edu/readingroom/books/far/ch6.html
x. www.devshed.com/Server_Side/MySQL/Normal