ICT Standards and Guidelines
Segment 104
Database Systems
Main Document
(Version 2.0)
The Office of the Minister of State for Administrative Reform (OMSAR) provides the
contents of the ICT Standards and Guidelines documents, including any component or
part thereof, submission, segment, form or specification, on an 'as-is' basis without
additional representation or warranty, either expressed or implied. OMSAR does not
accept responsibility and will not be liable for any use or misuse, decision, modification,
conduct, procurement or any other activity of any nature whatsoever undertaken by any
individual, party or entity in relation to or in reliance upon the ICT Standards and
Guidelines or any part thereof. Use of or reliance upon the ICT Standards and Guidelines
is, will be and will remain the responsibility of the using or relying individual, party or
entity.
The ICT Standards and Guidelines are works in progress and are constantly being
updated. The documentation should be revisited regularly to have access to the most
recent versions.
The last date of update for this document was June 2003.
Table of Contents - Database Systems
Executive Summary for Database Systems
The Background of Database Systems
The Scope of Database Systems
The Benefits of Standardization
Policies to Follow for Database Systems
Risks Resulting from the Standardization Activities
Related Documents
How to Use This Document?
Related Terms and Acronyms
Related Segments and Cross References
Related International Standards
2.10 All Segments in the ICT Standards and Guidelines
Roles and Responsibilities
Selecting Database Systems Components
Selecting Database Management Systems
Selecting the Interface Layer
4.2.1 Open Database Connectivity (ODBC)
4.2.2 Implementing Multi-Tier Architecture
Mandatory Conditions for Database Management Systems
Condition 1: Arabic Support
Condition 2: ODBC Support
Condition 3: Integrity Constraints
Condition 4: Transaction Management
Condition 5: Multi Users Access
Condition 6: Recovery from System Crashes
Condition 7: Access Control
Selecting the Architecture of a Database System
Centralized Databases
Distributed Databases
6.2.1 Data Fragmentation
6.2.2 Available Network
6.2.3 Transaction Management
6.2.4 Replication
Parallel Databases
Web Based Databases
Good Practices for Designing the Database
Data Modeling
Schema Representation
The Use of CASE Tools
Naming Conventions
Decomposition and Normalization
Using Database Triggers
Primary Key Selection
Good Practices for Operating and Maintaining Database Systems
Data Access and Security
Data Storage and Space Allocation
8.2.1 Disk Storage
8.2.2 Index Selection
Tuning the Database
Auditing the Database
Backup, Disaster Recovery and Contingency Planning
Documenting the Database System
Roles and Responsibilities During Operation and Maintenance
Training
Figures - Database Systems
How to Use the Database Systems Segment
The Components of a Database System
Architecture of a Web Based Database System
The Normalization Process
Database Systems Documentation Checklist
Roles and Responsibilities
Human Resources Involved in the Operation
Executive Summary for Database Systems
The objective of this segment is to present guidelines that can be used during the
acquisition, development and maintenance of database systems.
The segment defines the basic components of a database system to be the database, the
Database Management System (DBMS), the interface layer and the software application.
Good practices for the design, operation, and maintenance of the database are
presented in the second part of the segment.
The segment establishes Relational Database Management Systems (RDBMS) as the
standard for database management systems and presents the mandatory conditions for
their selection.
For the interface layer, the segment suggests using the multi-tier architecture design to
decouple the data from the software application. Alternatively, ODBC may be used as the
interface layer.
This segment proposes different architectures for database systems. The centralized,
distributed, parallel, and web-based designs are presented, each with its strengths and
weaknesses and the concrete reasons for its selection. The segment recommends
using web-enabled centralized databases unless there is a specific need for another
architecture.
The second part of the segment presents a set of good practices and guidelines.
Good practices for data modeling, schema representation, the use of CASE tools, data field
naming conventions, normalization to third normal form, the use of database triggers, and
primary key selection are presented to support a good design of the database system.
In a later section, this segment presents good practices for the operation and
maintenance of the database systems, practices for data access and security, data
storage and space allocation as well as for tuning the database, auditing the database,
and training on its use.
The segment proposes a schema for dividing the roles and responsibilities for operating
and maintaining the database system among the Ministry or Agency staff concerned.
Finally, the segment presents a list of the documentation that the DBA must keep and
update about the database system.
A separate and comprehensive segment covers Software Applications while another one
covers the Evaluation and Selection Framework for ICT products and services. This
segment provides the input to these two segments and shall not deal with their content.
Both these segments can be downloaded from OMSAR's website for ICT Standards and
Guidelines at
Database Systems
Page 1
The Background of Database Systems
This segment is concerned with Database Systems. Database systems are information
technology solutions built around databases. The basic components of a database
system are databases, database management systems and software applications.
The objective of this segment is to present and discuss guidelines that can be used
during the acquisition, development and maintenance of Relational Database
Management Systems (RDBMS).
Topics to be covered are the selection, administration, organization, storage, security
and efficiency of database systems.
The guidelines are applicable to any database system, whether it is part of an internally
developed application, of an application being developed by an external vendor, or of a
Commercial Off the Shelf (COTS) software package.
The Scope of Database Systems
The scope of database systems covered in this segment is limited to:
The rules governing the organization of the data within the database
Data storage
Data access and security
Backup, recovery and contingency planning
Basic database administration requirements
Human resources involvement and training
Selection criteria where applicable
Programming issues
Maintenance of databases
This segment is not concerned with:
The software applications that manipulate the data stored in the database
according to the business rules of the organization hosting the data. These are
presented in the Software Applications segment.
Data Warehouses which are multi-dimensional repositories of data gathered from
multiple sources.
Data Mining which is the technique of finding relevant information from a large
volume of data such as a data warehouse
The Benefits of Standardization
The objective of this segment is to provide a common framework for the acquisition,
development and maintenance of database systems.
Database Systems make up the largest segment of business oriented software
applications. Whether choosing Commercial Off the Shelf ERP applications or developing
a custom made application for a specific government ministry or agency, chances are
that a database is needed to store the information.
The requisitioning and the selection of future systems as well as the maintenance of
existing ones are a burden. If these database systems have no common framework and
no standard documentation, maintenance by itself will surely become a nightmare. Such
maintenance becomes a major issue after five years or so from the inception of such
systems. At that time, they would usually be exposed to disk crashes and database
performance deterioration.
Policies to Follow for Database Systems
The following policies are proposed:
Database systems should be used with the mandatory criteria presented in this
segment.
The Good Practices presented in this segment should be observed.
Standardized practices should be used as these would reduce training and
troubleshooting costs.
Risks Resulting from the Standardization Activities
When standardization is implemented, the following risks may arise:
The mandatory criteria are not observed while acquiring database systems
The recommended Good Practices are not observed
Incompetent persons are assigned roles related to the design, maintenance or
operation of database systems
Related Documents
One document is related to this segment and that is a Check List to be discussed in
Section 8.6.
How to Use This Document?
There is one main document. It defines database systems and defines selection criteria
for each of its components and for the architecture of the system as a whole.
The rest of the document presents good practices for the design of databases and for the
operation and maintenance of database systems.
Figure 1 depicts the road map to navigate through the segment.
[Figure 1 flowchart steps: Select DBMS (Sections 4.1, 5.0); Select Interface (Section 4.2);
Select the architecture (Section 6.0); Design the database and Develop or Acquire the
system (Section 7.0); Operate and Maintain the system (Section 8.0)]
Figure 1: How to Use the Database Systems Segment
Related Terms and Acronyms
ADO - ActiveX Data Objects
API - Application Programming Interface
ASP - Active Server Page
CASE - Computer Assisted Software Engineering
COTS - Commercial Off the Shelf Software
CRUD - Create, Read, Update and Delete
DBMS - Database Management System
ERD - Entity Relationship Diagram
ERP - Enterprise Resource Planning
Field - Attribute or column (interchangeable terms that mean the same thing)
ODBC - Open Database Connectivity
ORM - Object Role Modeling
RDBMS - Relational Database Management System
Record - Row (interchangeable)
Schema - The description of data in terms of the data model
SQL - Structured Query Language
UML - Unified Modeling Language
Related Segments and Cross References
The following segments have Standards and Guidelines that relate to this segment:
Database Systems
Operating Systems
Quality Management
Software Applications
Evaluation + Selection Framework
Information Integrity and Security
Related International Standards
There are no related standards for database systems as such. However, the Structured Query
Language (SQL) is a standardized language, although its implementation may differ from one
vendor to another.
The following organizations define standards for SQL:
ANSI - American National Standard for Information Systems - Database Language
ISO Database Language SQL - Part 2: Foundation
All Segments in the ICT Standards and Guidelines
OMSAR's website for ICT Standards and Guidelines is found at
and it points to one page for each segment. The following pages will take you to the
home page for the three main project documents and the 13 segments:
Global Policy Document
Cover Document for 13 segments
Legal Recommendations Framework
Hardware Systems
Database Systems
Operating Systems
Buildings, Rooms and Environment
Quality Management
Software Applications
Evaluation + Selection Framework
Information Integrity and Security
Data Definition and Exchange
Risk Management
Configuration Management
Each page contains the main document and supplementary forms, templates and articles
for the specific subject.
Roles and Responsibilities
The Database Administrator (DBA) is responsible for the design, operation and
maintenance of the database and the DBMS. The DBA is usually a staff member of the IT
Department of the Ministry or Agency. The IT Department Head or IT Manager is usually
responsible for the software application side of the database system.
When it comes to defining roles and responsibilities for the entire database system, other
staff from the IT department and from the Ministry or Agency are involved, with limited
responsibilities. This is especially true when acquiring a database system or selecting the
appropriate architecture. The IT Manager and DBA are asked for their input but they
might or might not be the final decision makers.
Section 8.7 describes in detail the roles and responsibilities of the staff involved in the
operation and maintenance of database systems.
Selecting Database Systems Components
This section addresses features that should be looked for when selecting a database
system. Several components are discussed.
The basic components of a database system are databases, database management
systems and software applications. Figure 2 represents the components that define a
database system. The minimum configuration for a database system is depicted in solid
lines. The dotted lines are optional additional components.
[Figure 2 diagram boxes: Software Application; DBMS Interface; Database 1, Database 2,
Database n]
Figure 2: The Components of a Database System
Databases are collections of related data. The names, ages and topics of interest
of all children attending a particular school form one collection. The makes, engine
numbers and colors of all cars driving on the roads of Lebanon can be considered as
another collection.
The software needed to control and maintain these collections of data is called a
Database Management System (DBMS).
Software Applications are special purpose programs that manipulate the data stored
in the databases through the DBMS according to the business rules and procedures of
the organization requesting the software application. The guidelines for selecting
software applications are discussed in a separate segment and will not be discussed in
this segment. The Software Applications segment can be downloaded from OMSAR's web
site for ICT Standards and Guidelines
Selecting Database Management Systems
The following is a list showing the various database technologies available today:
Network databases
Hierarchical databases
Relational databases
Object oriented databases
Even though some existing database systems still operate on hierarchical and network
databases, these technologies are almost obsolete and are not used for new systems today.
Standard: This segment proposes the use of Relational Database Management Systems
(RDBMS) as the database management system of choice.
Exception: An Object Oriented Database Management System (ODBMS) may be used
instead of an RDBMS where:
An RDBMS cannot be used because the nature of the application is highly
complex, as in Artificial Intelligence (for example, designing expert
systems), modeling (CASE tools, routing, workflow), or engineering applications.
An object oriented programming language such as C++ or Smalltalk is used to
code the functionality of the desired system. In this case, the ODBMS is a natural
extension of the programming language.
It is worth noting that even though the term “relational” applies to the data model
underlying the structure of the database, the data model is not a selection criterion by
itself.
For example, the popular FoxPro product uses a relational database model allowing the
organization of data in rows and columns and allowing the definition of relations between
them. However, it uses its own command language and not SQL in order to create and
manipulate the operating system files that form the database.
Hence, Microsoft FoxPro™ systems are file processing systems. Permanent records are
stored in various files supported by the operating system and the application programs
extract records and add records to the appropriate files.
File processing systems have major disadvantages such as allowing data redundancy and
inconsistency between files. Moreover, they have difficulty accessing data and managing
concurrent multiple users. Finally, they have integrity problems, atomicity problems (i.e.,
transaction management) and security problems. An example of the latter is that the
database can be accessed through the operating system.
Therefore, Microsoft FoxPro™ and Microsoft Access™ based systems are file processing
systems and are not database management systems. They are excluded from use by this
segment.
Selecting the Interface Layer
This layer is an optional component. If it is to be used, then one of the following must be
used:
ODBC (Open Database Connectivity) or
Multi-tier architecture
These are presented in the next two sections.
4.2.1 Open Database Connectivity (ODBC)
Through ODBC, a single executable (Source code) can access different DBMSs without
recompilation. In Figure 2 shown earlier, the ODBC drivers are in place of the DBMS
Interface box.
Even though the segment requires that the DBMS of choice support ODBC (see Section
5.2), the use of ODBC is not recommended unless needed, because the extra layer adds
a cost in performance and query time.
ODBC solves portability problems when multi-tier architecture is not implemented.
4.2.2 Implementing Multi-Tier Architecture
This segment emphasizes the importance of this architecture in the context of database
systems.
For database systems to be portable, reusable and easily maintained, all calls directed to
the database should be grouped in a DBMS interface class. This allows the separation of
calls to the database from the rest of the source code. The term “Class” refers to
separately compiled modules that make up libraries. Such modules are often referred to as
components and usually follow one of several industry standards such as COM or CORBA.
The first benefit of this separation is data independence, reached by decoupling
the calls to the database from the application code. Secondly, distributing the
deployment of the components on several servers results in an improvement of
performance.
For example, assume that a screen displays the personal data of a citizen, including the
date of birth. Within the application, the following pseudo code appears:
In the DBMS_Interface class (or library, package, etc.), the actual SQL (Structured Query
Language) or other calls are issued.
Such layering of the code would have reduced the impact of the Year 2000 problem, as
all references to retrieving and storing two-digit representations of years could have
been identified and corrected in the DBMS interface class, as opposed to searching
program after program for references to dates.
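The separation described above can be sketched as follows. This is only a minimal
illustration, assuming Python and its built-in sqlite3 module as a stand-in for the DBMS.
The class name DBMSInterface, the citizen table and the function show_personal_data are
hypothetical and do not come from this segment.

```python
import sqlite3
from datetime import date


class DBMSInterface:
    """Hypothetical DBMS interface class: the only place that issues SQL."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection

    def create_schema(self) -> None:
        # Dates are stored as ISO-8601 text with a four-digit year.
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS citizen ("
            " citizen_id INTEGER PRIMARY KEY,"
            " full_name TEXT NOT NULL,"
            " date_of_birth TEXT NOT NULL)"
        )

    def add_citizen(self, citizen_id: int, full_name: str, born: date) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO citizen VALUES (?, ?, ?)",
                (citizen_id, full_name, born.isoformat()),
            )

    def get_date_of_birth(self, citizen_id: int) -> date:
        row = self._conn.execute(
            "SELECT date_of_birth FROM citizen WHERE citizen_id = ?",
            (citizen_id,),
        ).fetchone()
        if row is None:
            raise KeyError(f"no citizen with id {citizen_id}")
        return date.fromisoformat(row[0])


def show_personal_data(db: DBMSInterface, citizen_id: int) -> str:
    """Application code: formats a screen line without issuing any SQL."""
    born = db.get_date_of_birth(citizen_id)
    return f"Citizen {citizen_id} was born on {born:%d/%m/%Y}"


db = DBMSInterface(sqlite3.connect(":memory:"))
db.create_schema()
db.add_citizen(1, "Sample Citizen", date(1975, 3, 14))
screen_line = show_personal_data(db, 1)
print(screen_line)
```

In this sketch, a change to how dates are stored would touch only the DBMSInterface
class, while show_personal_data and any other application code would stay unchanged.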
Other benefits of data independence are:
Decoupling the front end from the back end of the database system. An
example would be the use of Object Oriented applications with a relational
database.
Portability: migrating the front end to another language or changing the
back end.
Reusability: the ability to reuse all or parts of the system
Therefore, it is strongly recommended that multi-tier architecture be implemented for
the DBMS interface layer for database systems.
Mandatory Conditions for Database Management Systems
Having established that a Relational Database Management System (RDBMS) should be
used, this section defines the minimum requirements that the RDBMS must meet on its
own, without human intervention or external software programs.
Note that these features are mandatory and that when using the Evaluation and
Selection Framework, they represent pass or fail conditions for the selection of RDBMSs.
Condition 1: Arabic Support
Any DBMS being purchased must comply with ANSI standards to support the national
language character sets including Arabic. The database must be able to store Arabic
characters, regardless of whether the operating system hosting it supports Arabic or not.
Condition 2: ODBC Support
ODBC is considered as the primary standard for open systems. Most major software
vendors support it. It allows a single executable (Source code) to access different DBMSs
without recompilation.
Applications using ODBC are independent of which DBMS is being used at the source
code level and at the executable level. In addition, using ODBC allows the application to
access more than one DBMS simultaneously. This independence is achieved by adding an
extra layer, a DBMS specific driver, between the application and the DBMS(s). The driver
intercepts the SQL call issued from the application (Specified by the ODBC API) and
translates it to a DBMS specific call. (Refer to the diagram in Section 3.0).
The architecture of ODBC has four components:
The application
The driver manager
DBMS-specific driver(s) and
The data source (i.e., the corresponding DBMS).
Condition 3: Integrity Constraints
An integrity constraint is a condition that is enforced automatically by the DBMS and
whose violation prevents the data from being stored in the database. The DBMS enforces
integrity constraints in that it only permits legal instances to be stored in the database.
The key constraint and the referential integrity constraints are identified as the two
minimum constraints that must be enforced by the DBMS.
Key constraint or unique constraint: Every record must have one unique
identifier called the primary key that has a unique value within the table or
collection. Primary keys can be concatenated, which means that the uniqueness
can be made up of one or more fields.
Referential integrity constraint or foreign key constraint: This constraint
asserts that a reference in one data item indeed leads to another data item. A
foreign key is a field that is a primary key in another table. Referential integrity
consists of:
Not inserting a record if the value of the foreign key being inserted does
not match an existing record in another table with the primary key having
the same value,
Not deleting a record whose primary key is defined as a foreign key in
child records and
Not modifying the value of primary keys.
Most DBMSs enforce other types of constraints that concern the data content of a
field and are usually called Check constraints. Examples are limiting the values of a field
to a list of values or to a range of values, validating dates, and checking the format of the
data (e.g., no alphabetic characters allowed in a numeric field).
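The key, referential integrity and check constraints described above can be illustrated
with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the
school and child tables and the age range are hypothetical examples. Note that SQLite
enforces foreign keys only after the PRAGMA shown in the code, which is a peculiarity of
that engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE school (
        school_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE child (
        school_id INTEGER NOT NULL REFERENCES school(school_id),  -- foreign key
        child_no  INTEGER NOT NULL,
        age       INTEGER NOT NULL CHECK (age BETWEEN 3 AND 18),  -- check constraint
        PRIMARY KEY (school_id, child_no)                         -- concatenated key
    );
    INSERT INTO school VALUES (1, 'Example School');
    INSERT INTO child  VALUES (1, 1, 9);
    """
)


def rejected(sql: str) -> bool:
    """Return True when the DBMS refuses the statement with an integrity error."""
    try:
        with conn:
            conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


results = {
    "duplicate primary key": rejected("INSERT INTO child VALUES (1, 1, 10)"),
    "unknown foreign key": rejected("INSERT INTO child VALUES (99, 1, 10)"),
    "delete referenced parent": rejected("DELETE FROM school WHERE school_id = 1"),
    "check constraint": rejected("INSERT INTO child VALUES (1, 2, 40)"),
    "valid row": rejected("INSERT INTO child VALUES (1, 3, 12)"),
}
for case, was_rejected in results.items():
    print(f"{case}: {'rejected' if was_rejected else 'accepted'}")
```

The application code contains no validation logic of its own here; every illegal row is
refused by the DBMS, which is the behavior this condition asks for.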
Condition 4: Transaction Management
The DBMS must have a way to differentiate between a simple command or SQL
statement and a set of commands or SQL statements that form one transaction. For
example, a money transfer transaction that transfers money from account A to account
B is not complete before the first account has been debited and the second account
has been credited. The DBMS must therefore treat both updates as a single unit that either
completes entirely or has no effect.
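The money transfer example can be sketched as follows, assuming Python's built-in sqlite3
module as a stand-in DBMS. The account table, the starting balances and the rule that a
balance may not become negative are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])


def transfer(amount: int) -> bool:
    """Move money from A to B as one transaction; return False if rolled back."""
    try:
        with conn:  # commit if both updates succeed, roll back otherwise
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,)
            )
    except sqlite3.IntegrityError:
        return False
    return True


def balances() -> dict:
    """Return the current balance of every account, keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


first_ok = transfer(30)            # both updates succeed and are committed together
balances_after_first = balances()
second_ok = transfer(500)          # the debit of A violates the check constraint
balances_after_second = balances()
print(first_ok, balances_after_first)
print(second_ok, balances_after_second)
```

In the second call the credit to account B has already been executed when the debit of
account A fails. Because both statements belong to one transaction, the credit is undone
as well and the balances stay as they were after the first transfer.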
Condition 5: Multi Users Access
The DBMS must allow multiple users to access the database simultaneously. Additionally,
a locking protocol must be available to ensure that concurrently executing transactions
do not make conflicting changes to the same resource (row or table).
Before a transaction reads or writes a resource, the DBMS locks that resource in shared
mode (for reading) or exclusive mode (for writing). These and additional locks can be
further manipulated programmatically through the application.
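The effect of a locking protocol can be observed with a short sketch. It assumes Python's
built-in sqlite3 module and a temporary database file; the counter table is hypothetical.
SQLite locks the whole database file rather than a row or a table, so the sketch shows
the principle only, not the finer lock granularity expected from an RDBMS.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "locking_demo.db")

# isolation_level=None lets the code issue BEGIN and COMMIT explicitly;
# timeout=0 makes a blocked session fail at once instead of waiting.
session_1 = sqlite3.connect(path, timeout=0, isolation_level=None)
session_2 = sqlite3.connect(path, timeout=0, isolation_level=None)

session_1.execute("CREATE TABLE counter (value INTEGER NOT NULL)")
session_1.execute("INSERT INTO counter VALUES (0)")


def read_value(session: sqlite3.Connection) -> int:
    """Read the counter; fetchall() runs the query to completion."""
    return session.execute("SELECT value FROM counter").fetchall()[0][0]


session_1.execute("BEGIN IMMEDIATE")        # session 1 takes the write lock
session_1.execute("UPDATE counter SET value = value + 1")

try:
    session_2.execute("BEGIN IMMEDIATE")    # session 2 asks for the same lock
    second_writer_blocked = False
except sqlite3.OperationalError:            # reported as "database is locked"
    second_writer_blocked = True

# Shared (read) access is still granted and shows only committed data.
value_seen_by_reader = read_value(session_2)

session_1.execute("COMMIT")                 # releases the write lock
session_2.execute("BEGIN IMMEDIATE")        # now succeeds
session_2.execute("UPDATE counter SET value = value + 1")
session_2.execute("COMMIT")
final_value = read_value(session_2)

session_1.close()
session_2.close()
print(second_writer_blocked, value_seen_by_reader, final_value)
```

Both increments are applied one after the other, so neither update is lost.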
Condition 6: Recovery from System Crashes
Transactions can be interrupted before completion for a variety of reasons. Examples
are:
System crashes
Operating system crashes
User session crashes or
Forced disconnection
The DBMS must ensure that the changes made by such incomplete transactions are
automatically removed from the database, without the intervention of the database
administrator.
Likewise, the DBMS must ensure that changes made before the crash remain in the
database. So the DBMS must bring the database to a consistent state after a system
crash by ensuring that the effects of all transactions that completed prior to the crash
are restored and that the effects of incomplete transactions are undone.
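A small sketch can show the expected outcome. It assumes Python's built-in sqlite3 module
and a temporary database file, and the payment table is hypothetical. The sketch only
simulates a forced disconnection by closing a session that has an uncommitted
transaction; it does not reproduce a real system crash.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "recovery_demo.db")

session = sqlite3.connect(path)
session.execute(
    "CREATE TABLE payment (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)"
)
session.execute("INSERT INTO payment VALUES (1, 100)")
session.commit()                                  # transaction 1 completes

session.execute("INSERT INTO payment VALUES (2, 250)")   # transaction 2 starts
visible_before_failure = session.execute(
    "SELECT COUNT(*) FROM payment"
).fetchall()[0][0]
session.close()                                   # session ends without a commit

# A new session sees the completed transaction only.
recovered = sqlite3.connect(path)
rows_after_restart = recovered.execute("SELECT id, amount FROM payment").fetchall()
recovered.close()
print(visible_before_failure, rows_after_restart)
```

The interrupted session could see its own second row, but after reconnecting only the
committed payment remains. No cleanup code was written for this; the removal of the
incomplete change is done by the database engine itself.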
Condition 7: Access Control
Management of different security levels for accessing and/or manipulating the database
must be available within the DBMS. Furthermore, since users with similar access rights are
to be grouped together, the DBMS should allow these groups or roles to be given the same
privileges. This section reviews the minimum expected from the DBMS in order to
provide access control to the database.
However, these security levels are tools that must be put to use within the framework of
a security plan or security policy before they begin to guarantee the security of the
database. Section 8.1 describes the guidelines for drawing up security plans and
implementing them.
The DBMS must ensure that unauthenticated access of the database is not
permitted regardless of the network or Operating System security enforced.
Authentication means that the DBMS should always request a valid username and
password from any application, session and user accessing the database.
The security levels are formed by coupling privileges with database objects.
Database objects of interest are tables and views. Security at the column level is
not always available, but it is a desirable feature in an RDBMS.
The privileges of interest that can be granted on database objects are listed below
and are commonly referred to as the CRUD privileges.
insert rows
select rows and read data
update the contents of a row
delete rows
Additionally, the right to define REFERENCES means that foreign keys (in other
tables) that refer to the specified column or all columns can also be defined.
Other privileges such as creating, dropping and altering tables are usually given
to the system administrator or the user who owns the schema.
Privileges are assigned to individual users or to a group of users. The group of
users represents divisions in the real world, where people having the same role or
job within an organization perform similar job functions.
DBMSs allow discretionary access control, which means that users with privileges on
objects may pass on these privileges to other users. Discretionary access control is less
secure than mandatory access control, which assigns security classes to database objects
and clearances to users. However, most commercial DBMSs do not support mandatory access
control. Section 8.0 shows how discretionary access control can be strengthened to
achieve a higher level of security.
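The coupling of privileges, database objects and groups of users described in this section
can be sketched as a small model. The sketch is written in plain Python and is not the
access control mechanism of any particular DBMS. The role names, user names and table
names are hypothetical, and a real RDBMS would express the same grants through its own
privilege statements.

```python
from enum import Flag, auto


class Privilege(Flag):
    """The four CRUD privileges that can be granted on a table or view."""
    INSERT = auto()
    SELECT = auto()
    UPDATE = auto()
    DELETE = auto()


# Hypothetical roles: each couples database objects with a set of privileges.
ROLE_GRANTS = {
    "clerk": {"citizen": Privilege.SELECT | Privilege.INSERT},
    "auditor": {"citizen": Privilege.SELECT, "payment": Privilege.SELECT},
}

# Hypothetical users, each assigned to one role (group).
USER_ROLES = {"user_clerk": "clerk", "user_auditor": "auditor"}


def is_allowed(user: str, privilege: Privilege, table: str) -> bool:
    """Return True only if the user's role holds the privilege on the table."""
    role = USER_ROLES.get(user)
    if role is None:              # unknown users are refused
        return False
    granted = ROLE_GRANTS[role].get(table, Privilege(0))
    return privilege in granted


print(is_allowed("user_clerk", Privilege.INSERT, "citizen"))
print(is_allowed("user_clerk", Privilege.DELETE, "citizen"))
print(is_allowed("user_auditor", Privilege.SELECT, "payment"))
print(is_allowed("stranger", Privilege.SELECT, "citizen"))
```

Granting a privilege to a role rather than to each user means that a change in job
function requires a single change in the user-to-role assignment.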
Selecting the Architecture of a Database System
Choosing the database system architecture relies on two factors:
The underlying network on which the database system will run
The need to distribute data across multiple databases because of geographical or
administrative constraints
The following sections define different architectures. They provide the reader with the
criteria to be used when selecting the architecture of a database system. When selecting
databases, such criteria can be used in the Evaluation and Selection Framework
presented as a separate Standards and Guidelines Segment.
Centralized Databases
A centralized database system is a system that keeps the data in one single database at
one single location. In a centralized database system, a single machine called a database
server hosts the DBMS and the database.
Multiple users or client workstations can work simultaneously on a centralized database
system using the Client/Server configuration, or the Intranet configuration if
An underlying LAN (Local Area Network) is available (LANs can span one or few
adjacent buildings)
An underlying WAN (Wide Area Network) is available (WANs can span all
The client/server architecture is a very successful and popular one as it balances the
processing load between the client machine and the server machine.
The ongoing growth of Internet and intranet applications has refocused attention on
centralized databases. In such a configuration, the bulk of the processing does not lie on
the client machine, but rather on the machine hosting the Application Server and on the
database server machine.
The main disadvantage of centralized database systems is the single point of failure.
When the database fails, the work of all users is interrupted. However, when 24x7
operations are needed, there are solutions that minimize the risk of failure of the
database server, such as the use of a cluster server. Also, where WANs are used, failure
of part of the network means the interruption of work at the remote location.
Nevertheless, centralized databases are easier to manage, maintain and control for security
purposes. They should be the selection of choice if there is no need for a more complex
architecture.
Distributed Databases
The main difference between centralized and distributed database systems is that, in the
former, the data resides in one single location, whereas in the latter, the data resides in
several locations or on multiple servers at the same location.
The distribution of the data across locations should be transparent to the user who
continues to use the software application interface from his/her computer.
Distributed database systems involve many complex issues such as transparency,
transaction management, optimization, data fragmentation and replication. Their design
requires a high level of sophistication and competence from the supplier and their
management requires an experienced Database Administrator.
The issues summarized below must be assessed during the selection process.
It is recommended that distributed architectures be used strictly on a per need basis
because of the complexity of their design and maintenance.
6.2.1 Data Fragmentation
In order to assess the need for a distributed database system, the required partitioning
of the data or fragmentation must first be studied.
Horizontal partitioning means that complete records are divided among the locations, each
location storing its own subset of the records. For example, every branch stores the
records of its customers. In the broader sense, data of the Lebanese government is
horizontally partitioned, i.e., the records of people are stored in their respective
Muhafazat and citizens carry out their transactions at the government agency of their
Muhafazat.
Vertical partitioning means that different parts (fields) of the same record are stored
in different locations. For example, the customer data is stored at the Customer Relation
department in Dbayeh, the loans data is available in the Business Expansion department in
Baabda, etc.
The distributed database can involve both horizontal and vertical partitioning. For
example, each branch identified above keeps the data of the accounts that are opened in
that branch.
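The difference between the two kinds of partitioning can be illustrated with a short Python sketch. The records, field names and site names below are made up for the example; the functions are a minimal illustration and not part of any DBMS.

```python
# Illustrative customer records; field names and values are assumptions.
customers = [
    {"id": 1, "name": "Customer A", "branch": "Dbayeh", "loan_amount": 5000},
    {"id": 2, "name": "Customer B", "branch": "Baabda", "loan_amount": 0},
    {"id": 3, "name": "Customer C", "branch": "Dbayeh", "loan_amount": 12000},
]

def horizontal_partition(records, key):
    """Split by rows: each site holds complete records for one value of `key`."""
    sites = {}
    for record in records:
        sites.setdefault(record[key], []).append(record)
    return sites

def vertical_partition(records, primary_key, column_groups):
    """Split by columns: each site holds some fields plus the primary key."""
    fragments = {}
    for site, columns in column_groups.items():
        fragments[site] = [
            {primary_key: r[primary_key], **{c: r[c] for c in columns}}
            for r in records
        ]
    return fragments

by_branch = horizontal_partition(customers, "branch")
by_department = vertical_partition(
    customers,
    "id",
    {"customer_relations": ["name", "branch"], "loans": ["loan_amount"]},
)
```

Each vertical fragment repeats the primary key, which is what allows the parts of a record to be joined back together across sites.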
6.2.2 Available Network
The design of distributed database systems is strongly influenced by the type of
underlying WAN or LAN. Distributed database systems involving vertical partitioning can
run only on those networks that are connected continuously - at least during the hours
when the distributed database is operational.
Networks that are not continuously connected typically do not allow transactions across
sites, but may keep local copies of remote data and refresh the copies periodically. For
example, a nightly backup might be taken. For applications where consistency is not
critical, this is acceptable. This is also acceptable for systems involving horizontal
partitioning of the data.
6.2.3 Transaction Management
Distributed transaction management is needed when vertical partitioning is used. Special
techniques must be applied to ensure that a transaction is applied in all of the databases
involved, or in none of them, so as not to cause inconsistency. The usual technique is
called the two-phase commit.
It is recommended that the DBMS vendor provide the distributed transaction
management software. The supplier should not attempt to write transaction
management code nor buy a third party product for such a purpose.
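For readers unfamiliar with the two-phase commit, the idea can be sketched as follows. This is a teaching sketch only: the Participant class and the two_phase_commit function are hypothetical names, and the sketch leaves out the logging, timeouts and crash recovery that a DBMS vendor's implementation provides, which is why the recommendation above stands.

```python
class Participant:
    """Hypothetical stand-in for one database taking part in a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote on whether the local part of the work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if every site committed, False if all were rolled back."""
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit everywhere only if every vote was yes.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```

A single negative vote in the first phase causes every site to roll back, so the databases never end up with only part of the transaction applied.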
6.2.4 Replication
Replication is the process of synchronizing several copies of the same records or record
fragments located at different sites and is used to increase the availability of data and to
speed query evaluation.
The supplier must lay out a detailed Replication Plan including:
- The partitioning of the data and how to select data field names and key values so
  as not to cause conflicts between sites
- The timing of the replication (i.e., synchronous vs. asynchronous)
- Resolution of potentially conflicting updates at different sites and ways of
  detecting them
Note that suppliers often feel that they can handle replication themselves, especially
asynchronous replication (i.e., copying numerous records from one database to the other).
Unless such activities are labeled remote backups, it is recommended that the DBMS
vendor provide the replication software. The supplier should not attempt to write
replication code nor buy a third party product for such a purpose.
Parallel Databases
Parallel database systems make use of multiple processors, such as a cluster server, that
host the DBMS. The use of multiple CPUs allows database system activities to be
sped up, allowing faster response to transactions as well as more transactions per
second.
Parallel database systems can be selected when a very high volume of transactions per
second is expected from the system or when more than 100 users are expected to log
into the system at a given time. Examples might be filing taxes online, renewing vehicle
registrations, etc.
It is recommended that the DBMS vendor provide the software programs that ensure
that the DBMS take advantage of multiple processors. Under no circumstances should
the supplier write such code or buy a third party product for such a purpose.
Note that clustered servers are not exclusively specified for improving the performance
of the database system through parallel processing. Rather, they might be specified for
ensuring 24x7 availability of the database. In this case, DBMS vendors should also
provide software programs that ensure that the DBMS can switch automatically from one
node of the cluster server to the other in the case of node failure.
Web Based Databases
With the advent of the web, the trend is towards using the Internet and the intranet for
internal and external applications. It follows that all database systems should be fully
web enabled or at least contain Web based components.
The architecture of a web based database system involves the following software
components:
- Web Browser connecting the user to the Internet
- Web Server receiving requests from a remote Web Client. The Web Server simply
  retrieves the page defined by its URL and sends it to the Web Browser. Often, the
  Web Server has to execute a program in order to assemble a dynamic page. In
  this case, the program has to access data in a DBMS
- Application Servers are optional but are recommended for the Web architecture.
  They facilitate the execution of programs and include security, session
  management, etc.
- Software application interfaces with the Application Server to provide the pages
- DBMS interfaces with the Application Server to provide the data needed
The above is depicted in Figure 3. The Web Server is connected to the Internet and
handles all requests from remote Web Clients. The Web Server communicates with the
Application Server of the web enabled database system to retrieve the page requested. The
Application Server contains the web-enabled program that dynamically combines the data
from the Database Server with the screen layout held on the Application Server to send
the page requested. Application servers may be DBMS vendor specific.
Hosting the different software components on different physical server machines is
recommended. At a minimum, the DBMS and the database should always be hosted
together (Database server) separate from the Web or Application servers.
The field is still in motion so that a standard cannot be recommended at this point for
programming languages of software applications.
However, it is recommended that XML pages instead of plain HTML be generated by the
various Web based software applications. (XML is a document description standard that
allows the description of the content and structure of the document in addition to giving
display directives.)
Document Type Definitions (DTDs) are being developed for various application areas.
In very simplistic terms, DTDs are comparable to templates or to a relational database
schema in which specific fields have placeholders. XML query languages are already being
developed, and commercial DBMS vendors have little choice but to follow and fully support
the XML standard.
There are other issues concerning Web based database systems that do not fall under
the scope of this segment. For example, keyword search - which is the most common
kind of query on the Web today - is not suited for databases. In most databases, it is not
possible at the native level to search on the content of a field without knowing the field
name (or the object name). However, there are full-text search engines that can
compile an index of content by searching all fields and report a search result with a
percentage accuracy for each hit.
Figure 3: Architecture of a Web Based Database System (components shown: Web Browser,
Web Server, Application Server, Internet Server, Database System, Database Server)
Good Practices for Designing the Database
Design is the process of translating the abstract users’ needs into a model that can be a
workable foundation for constructing the database system.
The following sections present some Good Practices to be followed in the design of
database systems.
Data Modeling
Data modeling is the process of defining the structure of the database. Mainly it is
concerned with laying out the data stores of interest (objects, entities, or tables) and
defining the relations between them. The relations are the means of navigating between
one table and the other and so must be uncovered and accounted for in the design
phase in order to ensure proper execution of the application later during the
development phase. The data modeling activity ultimately results in generating a
schema, i.e., the organization of columns across tables and the definition of the relation
between them.
Several models exist for the modeling of data: Entity Relationship (ER), Object Role
Modeling (ORM), ODMG Object Model, etc. Other methods that include data modeling also
exist, such as Yourdon, Merise and the UML.
The most popular data modeling technique is Entity Relationship (ER) modeling; most
database data modeling currently uses some variant of it. Such models focus on the
objects of interest and the business relations between them so as to keep a link between
the model and the real world.
Relations can be modeled with real world business rules and terminology, for example,
one customer orders many products, one customer may open many bank accounts,
many work orders are combined with many sales orders, every citizen must have a
unique identifier, etc... ERs are adaptable to object modeling. In this sense, objects can
be modeled as entities (persons, cars, bank accounts, etc...).
Finally, ER models have a special importance because several CASE tools are built
around them. CASE tools provide a way to generate a relational schema directly from
the ER diagram.
It is recommended that ER modeling be used. Furthermore, the ER diagram should be a
required part of the standard documentation of a database system.
Schema Representation
The database schema is the representation of how the data is organized within the
database and of which database objects are available in it.
It is recommended that the data dictionary be the official representation of the
database schema and a required part of the standard documentation of any database
system.
The data dictionary format and content will be discussed in the Data Exchange Segment.
The Use of CASE Tools
CASE tools are Computer-Aided Software Engineering tools. These are now available
to generate both back ends (the database) and front ends (the software application) for
database systems.
The use of CASE tools is recommended because they:
- Maximize the consistency between the design and the implementation
- Minimize human error incurred by translating the design into source code
- Shorten the time spent on the development of source code
- Facilitate the implementation of change control procedures
Some CASE tools even integrate configuration management and allow the release of
multiple versions of the database. The designer focuses on implementing the change at
the highest level (the design level) and the CASE tool takes care of propagating the
change down to the physical database level. This is of utmost importance when new
releases occur after the start of operation of the database system. The possibility of
down time because of a version upgrade is minimized.
Other CASE tools can be used for modeling ERDs and would hence be able to generate
scripts for creating databases for different platforms.
It is recommended to use CASE tools to generate both the back end (the database) and
front end (the software application) of database systems. At a minimum, CASE tools
should generate the back end, i.e., the database.
Naming Conventions
Historically, names could be at most eight characters long, and programmers used
abbreviations and coded names to represent the data content of the field. Furthermore,
some software engineers have adopted the practice of coding the name of the table into
the name of the fields belonging to that table.
It is recommended that field names be as natural as possible. It is also recommended that
the name of the table not be included in the field name, because the table name can be
queried from the database data dictionary at any time.
This recommendation leads to a better readability and a more universal understanding of
the data content.
The most suitable practice would be to issue internal standards for the Naming
Convention of database elements.
Decomposition and Normalization
Normalization is the process of organizing the data at hand into tables. Normalization is
at the heart of the relational model theory. It states that any subset of the data can be
accessed if the database is in the Third Normal Form. Normalization ensures that a
query can be written to retrieve any information from the database.
The process of normalization applies to relational databases only. Figure 4 depicts this
process.
As a first step, all the data fields necessary to conduct a business are organized into one
long record. The resulting table is called the Un-normalized Form.
In order to move to the First Normal Form, repeating groups must be removed and put
into different tables. The primary key must be identified at this stage too.
Figure 4: The Normalization Process (Unnormalized Form -> remove repeating groups ->
First Normal Form -> remove partial dependencies -> Second Normal Form -> remove
transitive dependencies -> Third Normal Form)
In order to move to the Second Normal Form, partial dependencies must be removed so
that all non-key attributes become fully dependent on the whole primary key, even if the
primary key is concatenated (made up of many fields).
In order to move to the Third Normal Form, transitive dependencies must be removed.
Transitive dependencies occur when non-key attributes are dependent on another non-key
attribute.
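The result of the normalization steps can be illustrated with a small order-entry example, written here in Python with the built-in sqlite3 module. The tables, columns and data are assumptions chosen for the illustration. Starting from one long record (order number, customer name, customer city and a repeating group of products and quantities), the repeating group moves to its own table, product names are separated because they depend on only part of the concatenated key, and customer details are separated because they depend on the customer rather than on the order.

```python
import sqlite3

# Hypothetical schema in Third Normal Form; all names are illustrative.
SCHEMA = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE product (
    product_code TEXT PRIMARY KEY,
    name         TEXT NOT NULL
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE order_line (
    order_no     INTEGER NOT NULL REFERENCES orders(order_no),
    product_code TEXT    NOT NULL REFERENCES product(product_code),
    quantity     INTEGER NOT NULL,
    PRIMARY KEY (order_no, product_code)  -- concatenated primary key
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO customer VALUES (1, 'Customer A', 'Beirut')")
conn.executemany("INSERT INTO product VALUES (?, ?)",
                 [("P1", "Pen"), ("P2", "Paper")])
conn.execute("INSERT INTO orders VALUES (100, 1)")
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)",
                 [(100, "P1", 10), (100, "P2", 5)])

# A join rebuilds the original long record from the normalized tables.
flat_rows = conn.execute("""
    SELECT o.order_no, c.name, c.city, p.name, l.quantity
    FROM order_line AS l
    JOIN orders   AS o ON o.order_no = l.order_no
    JOIN customer AS c ON c.customer_id = o.customer_id
    JOIN product  AS p ON p.product_code = l.product_code
    ORDER BY l.product_code
""").fetchall()
```

The customer name and city are stored once, however many orders the customer places, yet the query still retrieves the complete picture.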
Using Database Triggers
A database trigger is a procedure that is automatically invoked by the DBMS in response
to a change event against the database. A database that has triggers attached to it is
called an active database.
The change events and the timing of the firing are specified within the trigger code and
they are:
- Before Insert
- After Insert
- Before Update
- After Update
- Before Delete
- After Delete
The importance of triggers emanates from the fact that they are fired according to their
set-up regardless of the source that is requesting the change. For example, when
database triggers are fired in response to an Update operation, the trigger code is
executed whether application 1, application 2, or the database administrator through the
SQL interface of the DBMS is performing the operation.
The frequency of firing the database trigger can also be controlled from within the code
of the trigger. The trigger can fire once for each row being affected by the operation
(Row-level trigger) or it can fire only once regardless of the number of rows being
affected by the operation (Statement-level trigger).
While database triggers are especially suited for auditing and statistics gathering, there
is neither a need nor a way to write a standard limiting the scope of their use by
application programmers. The standard does, however, recommend that application suppliers
and database administrators document database triggers, because the maintenance of active
databases is very difficult. The documentation is necessary because maintenance
personnel must be able to trace an error condition to either application code or database
trigger action. Finally, some conditions may cause database triggers to fire in a chain
reaction, and this also needs to be documented.
The database trigger documentation should include the name of the trigger, the firing
event or operation, the frequency of the firing, the name of the source table, the name
of any table(s) affected by the trigger, and a short description of the actions of the
trigger.
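A row-level trigger and the start of its documentation can be sketched with Python's built-in sqlite3 module. SQLite is used here only because it is easy to run; it supports row-level triggers (FOR EACH ROW) but not statement-level ones, and other DBMSs use a different trigger syntax. The table and trigger names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    employee_id INTEGER PRIMARY KEY,
    salary      INTEGER NOT NULL
);
CREATE TABLE salary_change (
    employee_id INTEGER,
    old_salary  INTEGER,
    new_salary  INTEGER
);
-- Row-level trigger: the body runs once for each updated row.
CREATE TRIGGER trg_employee_salary_au
AFTER UPDATE OF salary ON employee
FOR EACH ROW
BEGIN
    INSERT INTO salary_change VALUES (OLD.employee_id, OLD.salary, NEW.salary);
END;
INSERT INTO employee VALUES (1, 1000), (2, 2000);
""")

# One UPDATE statement touching two rows fires the row-level trigger twice.
conn.execute("UPDATE employee SET salary = salary + 100")
changes = conn.execute(
    "SELECT employee_id, old_salary, new_salary "
    "FROM salary_change ORDER BY employee_id"
).fetchall()

# The catalogue can seed the trigger documentation (name and source table).
trigger_list = conn.execute(
    "SELECT name, tbl_name FROM sqlite_master WHERE type = 'trigger'"
).fetchall()
```

The trigger fires whichever program issues the UPDATE. Listing the triggers from the catalogue gives the trigger name and source table; the remaining items of the documentation (affected tables, description of the actions) still have to be written by hand.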
Primary Key Selection
Each record (row) must be uniquely identified within the table.
Primary keys should not be modified or updated throughout the life of the database. If the
primary key must be updated because it was not selected properly, the entire row must
be deleted and recreated. If the record has dependent records or children in other
tables, which in turn have children in other tables, deleting the record is a major
undertaking in itself. So encoding meanings in the value of the primary key is a dangerous
practice that must be discontinued. For example, the following candidate for identifying a
citizen is not valid: the first three characters encode the city of residence, the next
three digits encode the religion and the last seven digits are the unique home telephone
number. All pieces of this candidate key are subject to change.
The practice of encoding meaning into the primary key value is inherited from the paper
bureaucracy and from the tight memory years. Now that powerful database systems are
available, it is possible to search on any combination of fields and to create indexes on
any combination of fields.
It must be possible to generate primary keys quickly and easily, and their generation
should not depend on other data. Because a record cannot be inserted in the database
without a primary key, any failure in the generation of the primary key can result in
losing the data and not being able to insert the record. Therefore, the use of unique,
automatically generated serial numbers is recommended for primary keys.
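A minimal sketch of an automatically generated serial primary key follows, using Python's built-in sqlite3 module. The citizen table, its columns and the helper add_citizen are assumptions for the example; other DBMSs provide the same facility through sequences or identity columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE citizen (
    citizen_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- serial number, no meaning
    full_name  TEXT NOT NULL,
    city       TEXT NOT NULL,
    phone      TEXT
);
""")

def add_citizen(conn, full_name, city, phone):
    """Insert a row and return the key generated by the database."""
    cur = conn.execute(
        "INSERT INTO citizen (full_name, city, phone) VALUES (?, ?, ?)",
        (full_name, city, phone),
    )
    return cur.lastrowid

first_id = add_citizen(conn, "Citizen A", "Beirut", "01-000000")
second_id = add_citizen(conn, "Citizen B", "Tripoli", "06-000000")

# Moving house changes ordinary columns only; the key stays the same.
conn.execute(
    "UPDATE citizen SET city = ?, phone = ? WHERE citizen_id = ?",
    ("Saida", "07-000000", first_id),
)
```

Because the key carries no meaning, a change of city or telephone number never forces the row and its dependent records to be deleted and recreated.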
Good Practices for Operating and Maintaining Database Systems
The following sections present some good practices to be followed in the operation and
maintenance of database systems.
Data Access and Security
As seen in section 5.7, the recommended standard requires that any DBMS in use by the
government must have the ability to assign different security levels to different users or
groups of users (roles).
However, even with the best intentions, this ability alone does not enforce security. A
clear and consistent security policy or security plan must be developed around these
abilities. A security plan must include the following:
- Identification of the objects that must be protected, i.e., tables, views and
  columns, and the reason for the protection
- Identification of the privileges associated with the protected objects (Create, Read,
  Update, Delete: CRUD)
- Identification of the users or groups of users who get access to the protected
  objects and to all objects
- Definition of the procedures to be followed in the handling of protected objects.
  For example, this table needs a journal, updating a field in this table causes the
  before image to be dumped in a history table, etc.
In parallel to the security plan, the following security policies must be followed:
- Assign privileges to users (or roles) at the database level and not at the
  application level
- Rely on the audit trail to identify user activities
- Give every user his or her own username and password
- Educate users not to give their own username and password to others
Data Storage and Space Allocation
8.2.1 Disk Storage
Suppliers must deliver a plan describing how they intend to store the database files on
the physical hard disks. The principle is simple and consists of “not putting all the eggs
in the same basket” in order to minimize the risk of physical disk failure and to improve
performance (as disk read/write operations are still much slower than memory
access). The details depend on the hardware configuration at hand, which cannot be
completely covered by this segment.
However, the standard offers the following guidelines:
- Identify tables with similar functionality, hence similar access frequency, and
  balance them across the physical disks so as not to use one physical disk more
  than the others. The different table groups are: frequently accessed lists of values
  tables, indexes, audit, journal and other read-only tables, etc.
- Identify the initial and expected size of tables after three years of operation and
  plan the initial storage allocation accordingly
- Remember that the disk mean time between failures (MTBF) is considered to be
  about 5.7 years by the industry. Therefore, increase vigilance with time as
  opposed to getting comfortable with the system with time.
A note must be added about RAID technology. (Please refer to the Hardware Segment
for a review of this hard disk technology.)
Redundant Arrays of Independent Disks (RAID) is a technology that arranges several
disks, controlled by software, to simulate having one large disk. RAID comes in
different levels. Depending on its level, RAID stores redundant data (mirror copies or
parity information) across the disks, with the result that only a percentage of the total
disk space is available to the user.
RAID uses the rest of the disk space to store the redundant data and to automatically
restore data upon the failure of one or more disks. With RAID, the database administrator
may either let the RAID control the storage of the data files on individual disks or
partition the logical disks himself.
8.2.2 Index Selection
The following guidelines for index selection are recommended:
- All primary keys and foreign keys must be indexed as a minimum
- Additional indexes can be added according to specific query needs. Attributes
  used in the WHERE clause of a SQL statement are good candidates for indexing.
  Additional indexes add to the DBMS workload, but the benefits of indexing usually
  outweigh the overhead cost associated with index maintenance. Adding indexes is one of
  the first actions to be considered to improve query and update performance. However,
  reevaluating indexes and dropping some should also be considered in the
  optimization of a particular update operation if it is taking too long.
- Hash versus tree index: a B+ tree index is usually preferable because of its
  versatility with both range queries and equality queries. It is usually the default
  indexing mechanism in a DBMS. However, hash clustering can be used for lists
  of values tables (e.g., gender, title, etc.)
Clustering indexes can sometimes lead to performance benefits. It is recommended that
the supplier not cluster any indexes because this task is best suited for the database
administrator in charge of operating the database.
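The effect of indexing an attribute used in a WHERE clause can be checked by asking the DBMS for its query plan. The sketch below does this with Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the vehicle table, the index name and the helper query_plan are assumptions for the example, and other DBMSs expose their plans through different commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle ("
    "vehicle_id INTEGER PRIMARY KEY, owner_id INTEGER, plate TEXT)"
)
conn.executemany(
    "INSERT INTO vehicle (owner_id, plate) VALUES (?, ?)",
    [(i % 100, f"P{i:05d}") for i in range(1000)],
)

def query_plan(conn, sql, params=()):
    """Return the plan description that SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

QUERY = "SELECT plate FROM vehicle WHERE owner_id = ?"

plan_before = query_plan(conn, QUERY, (42,))   # no index yet: full table scan
conn.execute("CREATE INDEX idx_vehicle_owner ON vehicle (owner_id)")
plan_after = query_plan(conn, QUERY, (42,))    # the plan now names the index
```

Comparing the plan before and after creating the index shows whether the DBMS actually uses it, which is a useful check before accepting the extra maintenance cost that each index adds to insert and update operations.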
Tuning the Database
The purpose of tuning is to adjust the parameters of the database engine in the light of
the changes that are affecting it during operation. Some known causes for the decay of
performance during operation are the increase in the number of the users and the
increase in the number of operations with time.
The database is tuned during the design phase for the anticipated operation phase.
However, no matter how careful the designers are, the tuning must be reevaluated
periodically during the operation phase. Preventive routine tuning is the best guard
against performance degradation.
The Database administrator (DBA) is responsible for tuning the database system. The
DBA should establish measurable tuning goals. Recommended measurable tuning goals
are:
- Response times: Response times address how long it takes for a user to receive
  data from a request, i.e., the result set of a query, or the time it takes to update
  a table, or generate a report.
- Database availability: Backups, changing tuning parameters and other
  housekeeping should be done as fast as possible.
- Memory utilization: Excessive paging and swapping can impact database and
  operating system performance.
- Disk utilization: Contention for disks should be kept to a minimum. The
  distribution of the data on disks shall be monitored for early detection of lack of
  free space in disks and table spaces.
The specifics of tuning depend on the database engine and vendor, so this segment cannot
elaborate further on the subject.
However, the guidelines specify that routine preventive tuning be performed on database
systems twice a year as part of the maintenance agreement. The database vendor
providing the maintenance is the best candidate to perform tuning on the database with
the coordination of the DBA.
Auditing the Database
Auditing is the ability to trace user access and manipulation of data.
There is no magic solution to this problem. Usually, the DBA sets programs to trace a
specific problem before finding the culprit. The programs that the DBA can use to audit
access to the database either come with the DBMS package, or are custom developed by
the DBA.
The Audit Trail utilities that come with the DBMS can be parameterized to trace specific
problems, such as tracing a user session, activity (UPDATE, DELETE), or tables (Who is
accessing this particular table). Anything beyond these choices leads to the development
of programs and database triggers.
The guidelines for auditing are:
- Define the tables that need to be traced and the activities of interest on these
  tables.
- Define specific fields that need to be traced. For example, change in the value of
  the salary.
- Add username, date and time stamp fields in every table. Up to four fields can be
  used: the user who created the record, date and time of creation, the last user
  who modified the record, date and time of last modification. These fields by
  themselves are not particularly helpful as they record only the last modification
  action. In order to obtain a log of any and all modifications on a table, the Audit
  Trail must be used in conjunction with these fields.
- Records of sensitive tables are never deleted; rather they are moved into a
  history table with the same name as the original table along with the name of the
  user who deleted the record and the date and time of deletion.
- The same activity can be made to take place for updates, i.e., the before image is
  moved into a history table
- Use database triggers and not application procedures for recording records in the
  history tables
- The DBA must clean up the Audit Trail and the History tables periodically.
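The history-table guidelines can be sketched with Python's built-in sqlite3 module. The permit table, the history table and the trigger names are assumptions for the example. SQLite has no notion of a logged-in database user, so the one-row app_session table is a stand-in invented for this sketch; a multi-user DBMS would record the name of the connected user instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the DBMS login name (assumption: SQLite has no user accounts).
CREATE TABLE app_session (username TEXT NOT NULL);
CREATE TABLE permit (
    permit_id INTEGER PRIMARY KEY,
    holder    TEXT NOT NULL
);
CREATE TABLE permit_history (
    permit_id INTEGER,
    holder    TEXT,
    action    TEXT,
    action_by TEXT,
    action_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Before image of every update goes to the history table.
CREATE TRIGGER trg_permit_bu BEFORE UPDATE ON permit FOR EACH ROW
BEGIN
    INSERT INTO permit_history (permit_id, holder, action, action_by)
    VALUES (OLD.permit_id, OLD.holder, 'UPDATE',
            (SELECT username FROM app_session));
END;
-- Deleted records are kept in the history table with the deleting user.
CREATE TRIGGER trg_permit_bd BEFORE DELETE ON permit FOR EACH ROW
BEGIN
    INSERT INTO permit_history (permit_id, holder, action, action_by)
    VALUES (OLD.permit_id, OLD.holder, 'DELETE',
            (SELECT username FROM app_session));
END;
INSERT INTO app_session VALUES ('clerk1');
INSERT INTO permit VALUES (1, 'Holder A');
""")

conn.execute("UPDATE permit SET holder = 'Holder B' WHERE permit_id = 1")
conn.execute("DELETE FROM permit WHERE permit_id = 1")
history = conn.execute(
    "SELECT permit_id, holder, action, action_by "
    "FROM permit_history ORDER BY rowid"
).fetchall()
```

Because the history rows are written by triggers, they are recorded whichever application or tool performs the update or delete.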
Backup, Disaster Recovery and Contingency Planning
These issues are discussed in the Segment on Security. However, here is a list of issues
that are Database Systems specific:
- A Contingency Plan shall be developed by the DBA to document how many days
  of work are lost due to a crash when the best recovery is possible and how the
  data lost can be recovered. To minimize the department’s down time, the
  contingency plan must describe the methods used to revert to manual operations
  and the methods of re-entering the manual data into the database once it has
  been recovered. Usually in database systems a paper record exists for
  transactions being entered (application forms, change of address forms, etc.).
  Such records must be identified by the plan as well as the methods for reverting
  to manual operation and for going back to the automated system.
- Logical Backups are not acceptable if they are the only method of backup being
  performed. Logical backups are software specific copies of the database. Logical
  backups are performed using a tool available in the DBMS and this is why they
  cannot be depended upon alone. If the version of the database is upgraded, or if the
  DBMS is not available, the logical backup is also unusable and all hope of recovery
  is lost. The Export and Import utilities are examples of logical backups.
- Physical Backups (disk to disk copy) must be available for the whole database
  even if that means shutting the database down periodically. The recommended
  frequency is weekly and always before upgrades.
- The DBA must be trained on the DBMS with regard to Disaster Recovery. DBMSs
  offer sophisticated methods for recovery. Some DBMSs allow recovery of
  the database from log files and archive files, so that it is not necessary to restore
  the database from other media at every failure.
- The database scripts, commands and programs used to create the database and
  populate it, or used to upgrade the database system, must be secured and made
  available. Clear documentation of the installation and upgrade procedures must
  also be available. This is in case the database is unrecoverable and needs to be
  built from scratch.
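The difference between a logical backup and a physical copy can be illustrated on a small scale with Python's built-in sqlite3 module. The example is only an analogy under stated assumptions: iterdump plays the role of an Export utility, and the Connection.backup method, which copies the database page by page, stands in for a physical backup. The table and data are made up for the illustration.

```python
import sqlite3

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE permit (permit_id INTEGER PRIMARY KEY, holder TEXT)")
source.execute("INSERT INTO permit VALUES (1, 'Holder A')")
source.commit()

# Logical backup: a script of SQL statements. Restoring it requires a working
# DBMS that still understands the statements in the script.
logical_dump = "\n".join(source.iterdump())

# Physical-style backup: a page-by-page copy of the database itself.
physical_copy = sqlite3.connect(":memory:")
source.backup(physical_copy)

# Restoring the logical backup means replaying the script in a new database.
restored = sqlite3.connect(":memory:")
restored.executescript(logical_dump)
```

The logical dump is plain SQL text that must be replayed by the DBMS, whereas the physical copy is immediately usable as a database. This is the reason the guidelines ask that logical backups never be the only backup method.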
Documenting the Database System
From its inception in the design phase of the software development process to its daily
operation requirements, the database system needs a set of complementary documents
to explain it and manage it. The various sections of this standard name and describe the
relevant documentation needed for the section.
The following table summarizes the list of the documentation needed along with the
section number of this segment where the document is discussed.
Document Name
Auditing Requirements
Back Up and Recovery Plan
Data Dictionary
Database System Architecture
Database Tuning Log
Disk Storage and Allocation Plan
ER Diagram (under Data Modeling)
Errors Log (under Roles and Responsibilities)
Installation Log (under Decomposition and Normalization)
List of Database Triggers
List of functions accessing the database
Replication Plan
Security Plan
Figure 5: Database Systems Documentation Checklist
Kindly use the template for this checklist which can be downloaded from OMSAR's
website for ICT Standards and Guidelines at
Roles and Responsibilities During Operation and Maintenance
Software systems cannot operate without human intervention. They need operators who
know how they work.
Consider an application that records employee attendance through the use of a hardware
interface (Hand or finger print machine). The inputs and outputs of such an application
are known and become limited with time (i.e., who is late, who is absent, is Friday
different from other week days, etc...).
Now consider a database system that is used as a financial ERP system. As work
procedures evolve and awareness is spread about the contents of the database, more
and more expectations will be generated about the system. Management will run
what-if scenarios and need new reports to confirm their suppositions. Over time,
modifications in the business rules of the organization will require resetting parameter
values at best and modifying the application source code itself in the worst case. Finally,
user error may cause data corruption and require the intervention of professionals.
Therefore, because of their nature, database systems need on-site human resources for
their successful operation, administration and maintenance. Without a thorough
understanding of how the system functions, no one can use the database system.
Most importantly, a clear statement of who owns and is responsible for the maintenance
of the database is crucial for the successful operation of the database system.
The following roles for the successful operation and maintenance of a database system
are identified:
Role: System Owner
Description: Department head (not from the ICT Unit) in the organization. The functions
performed by the system are the functions performed by his/her department. If the
database system performs the functions of more than one department, then each
department head is a Module Owner.
Responsibilities:
- Understands the full functionality of the system
- Reviews error logs with the ICT Manager and decides on the action to follow
- Requests new reports
- Requests new functionality

Role: Super User
Description: A user appointed by the System Owner who has in-depth functional knowledge
about one or more modules. This user can teach others.
Responsibilities:
- Uses the system
- Reports errors to the ICT Unit
- Performs routine control/checking of data entered by fellow users

Role: User
Description: A regular department employee or a data entry operator.
Responsibilities:
- Uses the system
- Reports errors to the super user
- Reports errors to the ICT Unit

Role: ICT Manager
Description: Head of the ICT Unit.
Responsibilities:
- Manages the department
- Prepares a summary of all errors (errors log) reported, for review and approval by
  the system owner

Role: ICT Unit staff
Responsibilities:
- Fixes operating system problems
- Fixes hardware problems such as printers and monitors
- Fixes network problems
- Develops new reports
- Fixes errors from the approved error log as assigned by the ICT Manager
- Updates the system documentation (User Guide)
- Assists users
- Understands the database schema
- Performs backups and recoveries
- Performs routine tuning
- Solves database performance problems
- Responsible for database security and user audit
Figure 6: Roles and Responsibilities
Figure 6 displays the distribution of the people involved within the hierarchy of the
organization. It is of vital importance that each player understands his or her role and
responsibility. The role of the system owner should not be diminished because of
computer phobia. The role of the ICT Unit should be restricted to operating and
supporting the system. The ICT Unit is not the owner of the database system. The
database system owner should be the owner of the business process.
An analogy would be the telephone company (which is technically responsible for the
telephone lines) changing the telephone numbers and the ring volume without the prior
knowledge and consent of the subscribers.
Figure 7: Human Resources Involved in the Operation
Note that when users report errors or request help, any of the following can be true:
- The user is misusing the system.
- The user is unaware of the existence of the functionality needed in the system.
- The user has uncovered a bug (an anomaly in computing). It is important to teach
  users to try to repeat the error, because knowing the sequence that caused the error
  is the first step in correcting it.
- The user has a hardware, network, or operating system problem.
- The user is requesting new functionality. It is especially important to have a
  clearly defined policy that identifies what is an error and what is a change
  request. This process is explained in the Segment on Change Management.
The ICT Unit staff keep an Error Log that differentiates between the types of reports,
especially between bugs and requests for new functionality. New functionality needs the
approval and the planning of the system owner.
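
The classification described above can be illustrated with a small sketch of an Error
Log. The Python below is illustrative only: the category names, the fields of
`ErrorLogEntry`, and the helpers `needs_owner_approval` and `summarize` are assumptions
introduced here for the example and are not prescribed by this standard.

```python
from dataclasses import dataclass
from enum import Enum


class ReportType(Enum):
    """Hypothetical categories mirroring the five cases listed above."""
    MISUSE = "user is misusing the system"
    UNKNOWN_FEATURE = "user is unaware of existing functionality"
    BUG = "bug"
    INFRASTRUCTURE = "hardware, network, or operating system problem"
    CHANGE_REQUEST = "request for new functionality"


@dataclass
class ErrorLogEntry:
    """One line of the Error Log kept by the ICT Unit (assumed layout)."""
    reported_by: str
    description: str
    report_type: ReportType
    steps_to_repeat: str = ""  # sequence that caused the error, if known


def needs_owner_approval(entry: ErrorLogEntry) -> bool:
    """New functionality needs the system owner's approval and planning."""
    return entry.report_type is ReportType.CHANGE_REQUEST


def summarize(log: list[ErrorLogEntry]) -> dict[str, int]:
    """Count entries per category, e.g. for the ICT Manager's summary."""
    counts = {t.name: 0 for t in ReportType}
    for entry in log:
        counts[entry.report_type.name] += 1
    return counts


# Sample entries (invented for illustration).
log = [
    ErrorLogEntry("user A", "Total is wrong after posting", ReportType.BUG,
                  steps_to_repeat="post invoice, reopen, print"),
    ErrorLogEntry("user B", "Need a monthly overtime report",
                  ReportType.CHANGE_REQUEST),
    ErrorLogEntry("user C", "Printer does not respond",
                  ReportType.INFRASTRUCTURE),
]
pending_approval = [e for e in log if needs_owner_approval(e)]
```

Keeping the report type as an explicit field makes it straightforward to separate bugs,
which the ICT Unit can fix once the log is approved, from change requests, which wait
for the system owner's decision.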
Note: Even though this section states how the change request originates, it does not
concern itself with how to perform the change requested. The Configuration Management
segment takes care of that.
Two types of training are specified for database systems operators:
- Formal Technical Training
- Functional Training
Technical training mostly concerns the ICT Unit. Topics should include:
- Database administration, tuning and especially backup and recovery
- Reports development
- Decision support systems
- Network administration
Database system users can also attend technical training. The topics should be
restricted to the operating system and word processing. The objective of this training
is to increase the user's confidence in computing and to reward the user.
Functional training is administered by the supplier of the application or COTS. It is
essential that a group of super users be identified early on and trained thoroughly on
the system. The group of super users may administer additional functional training.
Functional training should be administered periodically (e.g., every year) to renew
awareness of the database system.