Relational Database Design and Performance Tuning for DB2 Database Servers
Document Version: 1.0
Document Status: <Draft >
NOTE: The hard copy version of this document is FOR REFERENCE ONLY. Online version is the master.
It is the responsibility of the user to ensure that they have the current version.
Any outdated hard copy is invalid and must be removed from possible use.
It is also the responsibility of the user to ensure the completeness of this document prior to use.

IBM SWG - Tivoli Software
3901 S. Miami Blvd.
Durham, NC 27703
+1 (919) 224-1598
+1 (919) 224-2560
Issue Date:
Author: Edward Bernal
Template version 4
1. Document Control
   1.1 Summary of Changes
2. Introduction
   2.1 Definitions
   2.2 Performance Factors
      2.2.1 Database Design
      2.2.2 Application Design
      2.2.3 Hardware Design and Operating System Usage
3. Database Design Details
   3.1 Key Factors
      3.1.1 Tablespaces
      3.1.2 Tables
      3.1.3 Bufferpools
      3.1.4 Logging
      3.1.5 Indexing
   3.2 Database Maintenance
      3.2.1 REORG
      3.2.2 RUNSTATS
      3.2.3 REBIND
4. Application Design Details
   4.1 Key Factors
      4.1.1 Review application SQL for efficiencies *** IMPORTANT ***
      4.1.2 Concurrency Control and Isolation Level
      4.1.3 Locking
      4.1.4 Fetch needed columns only
      4.1.5 Reuse resources
      4.1.6 SQL Statement Preparation
      4.1.7 Query Tuning
5. Hardware Design and Operating System Usage
   5.1 Key Factors
      5.1.1 Memory
      5.1.2 CPU
      5.1.3 I/O
      5.1.4 Network
6. Tuning Opportunities
   6.1 Insert/Delete Performance
   6.2 Database Manager Configuration Tuning
   6.3 Database Configuration Tuning
   6.4 Bufferpools
   6.5 Registry Variables
7. Monitoring Tools
8. Tuning Methodology
   8.1 Tuning Approach
   8.2 Skills Needed
   8.3 General Process
   8.4 DB2 Specific Tuning
      8.4.1 SQL Reviews
      8.4.2 Database Statistics
      8.4.3 SNAPSHOT and EVENT Monitors
      8.4.4 DB2BATCH
9. Publications & Online Help
10. Sizing
11. Reference
12. Addendum
1. Document Control

1.1 Summary of Changes
The table below contains the summary of changes:

Version   Date   Description of changes
1.0              Distribution of 1st draft
2. Introduction
Any performance tuning requires some knowledge of the application involved. Relational Database
systems are no different. In fact, the more knowledge available about an application, the better one can
make decisions that will positively affect the performance of that application. While it is possible to perform some amount of tuning of relational database systems after the fact, paying attention to the overall design of your entire system up front will generally yield the best results.
While much of this material is applicable to DB2 on all available platforms, this paper specifically deals
only with DB2 on distributed platforms, i.e. Windows, UNIX, and Linux. The material was obtained
from readily available publications on DB2 and from over 18 years of experience designing,
developing, and tuning DB2 systems. While DB2 specific, many of the concepts are applicable to
Relational Databases in general, such as Oracle and Microsoft SQL Server.
This document is not intended to replace the detailed information that is available in various manuals,
Redbooks, etc., that deal specifically with DB2 performance. The intent is to point out some of the major factors that affect DB2 performance and, hopefully, to provide an easy-to-use and easy-to-understand reference so that the user will not have to read through and understand in detail all of the previously mentioned
reference material. The detailed material can, and should, be referenced when addressing a specific area
of concern. Refer to Section 9 for a listing of some of these publications. It is important to note that all of
these factors should be addressed for each application developed.
2.1 Definitions
• Throughput - The amount of data transferred from one place to another or processed in a specified amount of time. Data transfer rates for disk drives and networks are measured in terms of throughput. Typically, throughputs are measured in kbps, Mbps and Gbps.
• Optimizer - When an SQL statement needs to be executed, the SQL compiler needs to determine the access plan to the database tables. The optimizer creates this access plan, using information about the distribution of data in specific columns of tables and indexes if these columns are used to select rows or join tables. The optimizer uses this information to estimate the costs of alternative access plans for each query. Its decisions are heavily influenced by statistical information about the size of the database tables and available indexes.
• Clustered Index - An index whose sequence of key values closely corresponds to the sequence of rows stored in a table. The degree to which this correspondence exists is measured by statistics that are used by the optimizer.
• Cardinality - With respect to tables, the number of rows in the table. With respect to indexed columns, the number of distinct values of that column in a table.
• Prefetch - An operation in which data is read before, and in anticipation of, its use. DB2 supports the following mechanisms:
  o Sequential prefetch - A mechanism that reads consecutive pages into the buffer pool before the pages are required by the application.
  o List prefetch - Sometimes called list sequential prefetch. Prefetches a set of non-consecutive data pages efficiently.

2.2 Performance Factors
There are a number of areas that factor into the overall performance of any application. Below are the
general areas of concern along with an explanation of each.
2.2.1 Database Design
The term “Database Design” can mean a lot of things to a lot of people. There are two main types of data models: a logical model and a physical model. A “Logical” model is a representation, often graphical in nature, of the information requirements of a business area; it is not a database. Its main purpose is to ensure that its
structure and content can represent and support the business requirements of an area of interest. It is
independent of any database technology.
After completing your logical database design, there are a number of issues you should consider about the
physical environment in which your database and tables will reside. These include understanding the files
that will be created to support and manage your database, understanding how much space will be required
to store your data, and determining how you should use the tablespaces that are required to store your
data.
This document deals only with the aspects of physical database design.
2.2.2 Application Design
For the purposes of this document, application design deals with aspects of how you access your database
system. There are a number of techniques that will be discussed that, if used, can positively influence the
performance of your application.
2.2.3 Hardware Design and Operating System Usage
For any database system, there are a number of common areas that need to be addressed and sized
appropriately in order to support your application workload. This section will discuss common, and
platform specific, hardware and operating system components.
3. Database Design Details
3.1 Key Factors
3.1.1 Tablespaces
A tablespace is a physical storage object that provides a level of indirection between a database and the
tables stored within the database. It is made up of a collection of containers into which database objects
are stored. A container is an allocation of space to a table space. Depending on the table space type, the
container can be a directory, device, or file. The data, index, long field, and LOB portions of a table can
be stored in the same table space, or can be individually broken out into separate table spaces.
When working with database systems, the main objective is to be able to store and retrieve data as quickly
and efficiently as possible. One important consideration when designing your database or analyzing a
performance problem on an existing database is the physical layout of the database itself. DB2 provides
support for two types of tablespaces:
 System Managed Space (SMS) - stores data in operating system files. They are an excellent choice
for general-purpose use. They provide good performance with little administration cost.
 Database Managed Space (DMS) - with database-managed space (DMS) table spaces, the
database manager controls the storage space. A list of devices or files is selected to belong to a
tablespace when it is defined. The space on those devices or files is managed by the DB2 database
manager. There is some additional administration cost with DMS tablespaces primarily due to
monitoring and adjusting the size of the pre-allocated files. A DMS tablespace can be easily
increased in size by either ALTERing an existing container or adding a new container to it.
3.1.1.1 Recommendations

• DMS tablespaces usually perform better than SMS tablespaces because they are pre-allocated and
do not have to spend time extending files when new rows are added. DMS tablespaces can be
either raw devices or file system files. DMS table spaces in raw device containers provide the best
performance because double buffering does not occur. Double buffering, which occurs when data
is buffered first at the database manager level and then at the file system level, might be an
additional cost for file containers or SMS table spaces.
If you use SMS tablespaces, consider using the db2empfa command on your database. The
db2empfa (Enable Multipage File Allocation) tool enables the use of multipage file allocation for
a database. With multipage file allocation enabled for SMS table spaces, disk space is allocated
one extent rather than one page at a time, improving INSERT throughput.
Using DMS tablespaces also allows a single table to store its data, index, and large objects on up
to three different DMS tablespaces, thus improving performance through parallel disk I/O
operations. Parallel I/O is the process of reading from or writing to two or more I/O devices at the
same time to reduce response time. For example, look at the following statement (assume there are
no large object data types):
CREATE TABLE T1
   (COLA ...)
   IN TS1 INDEX IN TS2
CREATE INDEX IX1 ON T1
(COLA ASC)
The table data will be placed in tablespace TS1 and the index data will be placed in tablespace TS2.
It would be important to put the different tablespaces on different disk drives in order to enable the
possibility of parallel I/O operations. In fact, each tablespace can have multiple containers, and
each of those containers could be on a different disk drive. The ideal configuration has to consider
a number of factors, such as the number of disks available, RAID level, etc.
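As an illustration only (the tablespace names, container paths, and page counts are hypothetical and must be sized for your environment), separate DMS tablespaces for data and indexes, each with containers on different disks, could be defined as follows:

CREATE TABLESPACE TS1
   MANAGED BY DATABASE
   USING (FILE '/disk1/db2/ts1_c1' 25600,
          FILE '/disk2/db2/ts1_c2' 25600)

CREATE TABLESPACE TS2
   MANAGED BY DATABASE
   USING (FILE '/disk3/db2/ts2_c1' 12800)

The table from the earlier example would then keep its data in TS1 and its index in TS2, spread across three different disks.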
3.1.1.2 Platform Specific Recommendations
Windows
File system caching is performed as follows:
 For DMS file containers (and all SMS containers), the operating system might cache pages in the
file system cache
 For DMS device container table spaces, the operating system does not cache pages in the file
system cache.
 On Windows, the DB2 registry variable DB2NTNOCACHE specifies whether or not DB2 will
open database files with the NOCACHE option. If DB2NTNOCACHE=ON, file system caching
is eliminated. If DB2NTNOCACHE=OFF, the operating system caches DB2 files. This applies to
all data except for files that contain LONG FIELDS or LOBS. Eliminating system caching allows
more memory to be available to the database so that the buffer pool or sortheap can be increased.
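For example (assuming a dedicated Windows database server where eliminating file system caching is desired), the registry variable is set with the db2set command and takes effect after the instance is recycled:

db2set DB2NTNOCACHE=ON
db2stop
db2start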
3.1.2 Tables
As previously discussed, before creating your physical database tables, you should draw a logical design
of your data model. We will briefly discuss aspects of logical database design that affect performance.
3.1.2.1 Normalization
This is the process of restructuring a data model by reducing its relations to their simplest forms. It is a
key step in the task of building a logical relational database design. Normalization removes redundancy from your data and can improve the performance of update and delete statements, since each piece of data needs to be changed in only one place. By normalizing your data, you try to ensure that all columns in the table depend on the
primary key. The disadvantage of a fully normalized data structure is in data retrieval operations,
specifically when a query is accessing a large number of related pieces of data from different tables via
join operations. For more information about normalization, author C.J. Date is one of the better resources; his works can be found by searching the internet.
3.1.2.2 Denormalization
Denormalization is the intentional duplication of columns in multiple tables, with increased data redundancy as a consequence. It is sometimes necessary to minimize performance problems and is a key
step in designing a physical relational database design. The decision about whether to denormalize or not
should be based on the following:
• Can you utilize and implement some of the other methods described to tune your database and improve performance to an acceptable level without denormalizing?
• Can you quantify the likely performance gains by denormalizing, and are they a reasonable trade-off against the added update overhead?

3.1.2.3 Other considerations

NULLs - In general, columns defined as NOT NULL perform better than nullable columns due to
the path length reduction. The database manager does not have to check for null values in a NOT
NULL column. Also, every nullable column requires one extra byte per column value. Use
NULLs where appropriate and not as a default.

Column lengths – you should define your column lengths, particularly VARCHAR columns, as
small as possible for your application. By doing this there will be space savings which may lead
to a reduced number of used table and index pages, and index levels, which can improve query
performance.
o If you create an index on a column defined as VARCHAR, all index entries for that
column take the maximum length of the VARCHAR definition, even if the actual data
length in the column is smaller than the maximum.

Identity Columns - Significant performance enhancement can be realized using DB2 generated
identity values compared to those implemented by an application. They are typically used for
generating unique primary key values.

Put frequently updated columns together and at the end of the row. This has an effect on update
performance due to the following logging considerations:
o For fixed-length row updates, DB2 logs from the first changed column to the last changed
column
o For variable length row updates, DB2 logs from the first changed byte to the end of the
row. If the length of a variable length column changes, this will result in a change to the
row header (which includes the row length), and thus the entire row will be logged.
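The following DDL sketch (the table and its columns are hypothetical) pulls several of these considerations together: NOT NULL columns, VARCHAR lengths kept as small as practical, a DB2-generated identity column for the primary key value, and the frequently updated columns grouped at the end of the row:

CREATE TABLE ORDERS
   (ORDER_ID     INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY
                   (START WITH 1, INCREMENT BY 1),
    CUST_NAME    VARCHAR(60) NOT NULL,
    ORDER_DATE   DATE NOT NULL,
    ORDER_STATUS CHAR(1) NOT NULL WITH DEFAULT 'N',
    LAST_UPDATED TIMESTAMP NOT NULL WITH DEFAULT CURRENT TIMESTAMP)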
3.1.3 Bufferpools
A bufferpool is an area of memory into which database pages are read, modified, and held during
processing. On any system, accessing memory is faster than disk I/O. DB2 uses database buffer pools to
attempt to minimize disk I/O.
There is no definitive answer to the question of how much memory you should dedicate to the buffer
pool. Generally, more is better. A good rule of thumb would be to start with about 75% of your system’s
main memory devoted to buffer pool(s), but this rule is applicable only if the machine is a dedicated
database server. Since it is a memory resource, its use has to be considered along with all other
applications and processes running on a server.
• A spreadsheet will be provided later in this document that can be used to estimate DB2 memory usage.
If your tablespaces have multiple page sizes, then you should create only one buffer pool for each page
size.
There are some cases where defining multiple buffer pools of the same size can improve performance,
but, if badly configured, can have a huge negative impact on performance. Consider the following when
deciding to create multiple buffer pools:
 You create tables which reside in table spaces using a page size other than the 4 KB default. This
is required (as mentioned above).
 You have tables which are accessed frequently and quickly by many short update transaction
applications. Dedicated buffer pool(s) for these tables may improve response times.
 You have tables larger than main memory which are always fully scanned. These could have their
own dedicated buffer pool.
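As a minimal sketch (the bufferpool name, size, page size, and tablespace definition are hypothetical), a dedicated buffer pool for an 8 KB page size and a tablespace assigned to it could be created as follows:

CREATE BUFFERPOOL BP8K SIZE 50000 PAGESIZE 8K

CREATE TABLESPACE TS8K
   PAGESIZE 8K
   MANAGED BY DATABASE
   USING (FILE '/disk4/db2/ts8k_c1' 25600)
   BUFFERPOOL BP8K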
3.1.4 Logging
One of the main purposes of all database systems is to maintain the integrity of your data. All databases
maintain log files that keep records of database changes. DB2 logging consists of a set of primary and
secondary log files that contain log records that record all changes to a database. The database log is used
to roll back changes for units of work that are not committed and to recover a database to a consistent
state. DB2 provides two logging strategy choices.
3.1.4.1 Circular logging
This is the default log mode. With circular logging, the log records fill the log files and then overwrite the
initial log records in the initial log file. The overwritten log records are not recoverable. This type of
logging is typically not suited for a production application.
3.1.4.2 Log Retain logging
Each log file is archived when it fills with log records. New log files are made available for log records.
Retaining log files enables roll-forward recovery. Roll-forward recovery reapplies changes to the database
based on completed units of work (transactions) that are recorded in the log. You can specify that rollforward recovery is to the end of the logs, or to a particular point in time before the end of the logs.
Archived log files are never directly deleted by DB2; therefore, it is the application's responsibility to
maintain them; i.e. archive, purge, etc.
3.1.4.3 Log Performance
Ignoring the performance of your database in relation to its logging can be a costly mistake, the main cost
being time. Placement of the log files needs to be optimized, not only for write performance, but also for
read performance, because the database manager will need to read the log files during database recovery.
3.1.4.4 Recommendations

• Use the fastest disks available for your log files
o Use a separate array and/or channel if possible
• Use Log Retain logging
• Mirror your log files
• Increase the size of the database configuration Log Buffer parameter (logbufsz)
  o This parameter specifies the amount of the database heap to use as a buffer for log records before writing these records to disk. The log records are written to disk when one of the following occurs:
     - A transaction commits, or a group of transactions commit, as defined by the mincommit configuration parameter.
     - The log buffer is full.
     - As a result of some other internal database manager event.
  o Buffering the log records will result in more efficient logging file I/O because the log records will be written to disk less frequently, and more log records will be written each time.
• Tune the Log File Size (logfilsiz) database configuration parameter so that you are not creating excessive log files.
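For example (the database name and values are hypothetical and must be tuned for your workload), logbufsz and logfilsiz are database configuration parameters and can be updated from the command line:

db2 UPDATE DATABASE CONFIGURATION FOR MYDB USING LOGBUFSZ 256
db2 UPDATE DATABASE CONFIGURATION FOR MYDB USING LOGFILSIZ 8192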
3.1.5 Indexing
An index is a set of keys, each pointing to a row, or rows in a table. An index serves two primary
purposes:
1. To ensure uniqueness, as in the case of a Primary Key. Unique indexes can be created to ensure
uniqueness of the index key. An index key is a column or an ordered collection of columns on
which an index is defined. Using a unique index will ensure that the value of each index key in the
indexed column or columns is unique.
2. To allow more efficient access to rows in a table by creating a direct path to the data through
pointers. The SQL optimizer automatically chooses the most efficient way to access data in
tables. The optimizer takes indexes into consideration when determining the fastest access path to
data.
The main advantages of indexes were pointed out above. Creating an index to ensure uniqueness is typically driven by a business requirement of the application, and such indexes are thus absolutely necessary.
Beyond that, you should be very careful about the number and size of the indexes that you create on your
database tables. Each additional index will have an impact on the following:
 disk storage
 INSERT and DELETE Processing
o CPU
o I/O
o Logging

Database Maintenance
o REORG
o RUNSTATS
3.1.5.1 Index Considerations
The creation of any index for the purpose of efficient access should be done based on a review of the
actual SQL that has been, or will be, written against the tables.
In addition to the above mentioned comment, the following items need to be considered when
determining which indexes to create.
 The size of the index, determined by:
o The number and size of the columns in the index
o The projected, or actual, volume of data in your tables.
Indexes are implemented as a B-Tree structure, with a root level, intermediate level(s), and leaf
pages (the lowest level) which contain actual pointers to the data. A typical index on a moderately
sized table would be 3 levels deep. That means that if the optimizer decided to use this index to
read the data, it would have to do a minimum of 4 I/O’s to satisfy the query:
1. the index root page
2. one intermediate level page
3. one leaf page
4. one data page
As the size of the index increases, the number of pages needed to store the index entries and/or the
number of intermediate levels increases as well.

The Cardinality of the indexed columns – one of the most often overlooked issues when creating
an index. Consider the following example to illustrate the point:
CREATE TABLE T1
   (COLA INTEGER  NOT NULL,
    COLB SMALLINT NOT NULL,
    COLC CHAR(10) NOT NULL)

CREATE UNIQUE INDEX IX1 ON T1
   (COLA ASC) CLUSTER

CREATE INDEX IX2 ON T1
   (COLB ASC)
For this example, assume
1. T1 has 100,000,000 rows
2. the cardinality of COLB is 4, with an even distribution among the rows
3. we are not considering Multi-Dimensional Clustering (MDC)
4. the following query is typically run against this table
SELECT COLA, COLC
FROM T1
WHERE COLB = 3
On the surface, this looks like it all makes sense. We have a SELECT statement with an
“=” predicate on COLB, and we have an index defined on that column, what could be
better!
This table has a clustering index defined on COLA. The intent of a clustering index is so
that the sequence of key values closely corresponds to the sequence of rows stored in a
table. Each table can only have one clustering index, so our index on COLB is not
clustered.
Assuming an equal distribution of rows across the COLB predicate, our WHERE clause,
“COLB = 3”, will return 25,000,000 rows to our query: 100,000,000/4.
It is highly unlikely that the DB2 optimizer would select the use of the COLB index (IX2)
to satisfy this query, due to high I/O cost. It would more likely decide to scan the entire
table by taking advantage of sequential Prefetch.
Since this index would not be used to satisfy the query, there would only be a negative
impact by creating it as discussed above; i.e. disk, CPU, INSERT and DELETE processing,
etc.

The order of the columns in the index – another of the most often overlooked issues when creating
an index. Consider the following example to illustrate the point
CREATE TABLE T1
   (COLA INTEGER  NOT NULL,
    COLB SMALLINT NOT NULL,
    COLC CHAR(10) NOT NULL,
    COLD INTEGER)

CREATE UNIQUE INDEX IX1 ON T1
   (COLA ASC)

CREATE INDEX IX2 ON T1
   (COLB ASC,
    COLC ASC,
    COLD ASC) CLUSTER
For this example, assume
1. T1 has 100,000,000 rows
2. the following query is typically run against this table
SELECT COLB, COLD
   FROM T1
   WHERE COLC = 'DATA'
     AND COLD > 42
We have a SELECT statement with an “=” predicate on COLC, a second predicate with a
“>” predicate on COLD, and we have an index (IX2) defined on those columns, plus
COLB, which is also in our SELECT list.
Will the DB2 optimizer select index IX2 to satisfy this query? The answer is not so clear.
Remember that DB2 indexes are created as a B-tree structure. The first column of the IX2
index is COLB, but we have not provided a predicate for that column in our where clause,
so DB2 can not effectively navigate the B-tree since the high order part of the key was not
provided.
This does not, however, eliminate the possibility of the use of this index. Since all of the
data referenced (COLB, COLC, and COLD) is in the index, DB2 has two possibilities,
either scan the entire table, or scan the entire index. Since there would be fewer pages in
the index, and since there is no need to go to the data pages, an index scan would probably
be selected. Index scans are also eligible for Prefetch operations.
If this was a typical query used against this table, a better definition of the IX2 index would
be to order the columns as follows:
CREATE INDEX IX2 ON T1
(COLC ASC,
COLD ASC,
COLB ASC) CLUSTER
This would allow effective use of the B-tree structure by providing matching values on the
high order part of the key.
These examples were provided for illustration only. You need to consider all of the database access
queries against your tables, not just an individual SELECT statement, in order to determine the best index
strategy.
3.1.5.2 Index Recommendations
o Create as few indexes as possible
o Consider creating the INDEXES with the “ALLOW REVERSE SCANS” option
o Pay close attention to the order of the columns in the index
o Don’t create redundant indexes
o Use DB2 “Explain” facilities to determine the actual usage of the indexes
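As an example of the last recommendation (the database name and statement are hypothetical), the db2expln tool prints the access plan chosen for a statement, which shows whether an index is actually being used:

db2expln -database MYDB -statement "SELECT COLA, COLC FROM T1 WHERE COLB = 3" -terminal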
3.2 Database Maintenance
Regular maintenance is a critical factor in the performance of a database environment. This involves
running the Reorg, Runstats, and Rebind facilities, in that order, on the database tables. A regularly
scheduled maintenance plan is essential in order to maintain peak performance of your system.
3.2.1 REORG
After many changes to table data, caused by INSERT and DELETE activity and UPDATEs of variable-length columns, logically sequential data may end up on non-sequential physical data pages, so that the
database manager must perform additional read operations to access data. You can reorganize DB2 tables
to eliminate fragmentation and reclaim space using the REORG Command.
• Significant reductions in elapsed times due to improved I/O can result from regularly scheduled REORGs.
• DB2 provides two types of REORG operation:
  o Classic REORG
     - Provides the fastest method of REORG
     - Indexes are rebuilt during the reorganization
     - Ensures perfectly ordered data
     - Access is limited to read-only during the UNLOAD phase, no access during other phases
     - Is not re-startable
  o In-Place REORG
     - Slower than the Classic REORG, takes longer to complete
     - Does not ensure perfectly ordered data or indexes
     - Requires more log space
     - Can be paused and re-started
     - Can allow applications to access the database during reorganization

3.2.1.1 Recommendations

• Implement a regularly scheduled maintenance plan
• If you have an established database maintenance window, use the Classic REORG
• If you operate a 24 by 7 operation, use the In-Place REORG
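For example (the schema and table names are hypothetical), a Classic REORG and an In-Place REORG that allows concurrent write access can be run as follows:

db2 REORG TABLE DB2ADMIN.T1
db2 REORG TABLE DB2ADMIN.T1 INPLACE ALLOW WRITE ACCESS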
3.2.2 RUNSTATS
It was mentioned earlier that the DB2 optimizer uses information and statistics in the DB2 catalog in
order to determine the best access to the database based on the query provided. Statistical information is
collected for specific tables and indexes in the local database when you execute the RUNSTATS utility.
When significant numbers of table rows are added or removed, or if data in columns for which you collect
statistics is updated, execute RUNSTATS again to update the statistics. Use the RUNSTATS utility to
collect statistics in the following situations:
• When data has been loaded into a table and the appropriate indexes have been created.
• When you create a new index on a table. You need to execute RUNSTATS for only the new index if the table has not been modified since you last ran RUNSTATS on it.
• When a table has been reorganized with the REORG utility.
• When the table and its indexes have been extensively updated, by data modifications, deletions, and insertions. (“Extensive” in this case may mean that 10 to 20 percent of the table and index data has been affected.)
• Before binding, or rebinding, application programs whose performance is critical.
• When you want to compare current and previous statistics. If you update statistics at regular intervals you can discover performance problems early.
• When the prefetch quantity is changed.
• When you have used the REDISTRIBUTE DATABASE PARTITION GROUP utility.
There are various formats of the RUNSTATS command, mainly determining the depth and breadth of statistics collected. The more you collect, the longer the command takes to run. Some of the options are
as follows:
• Collect either SAMPLED or DETAILED index statistics
• Collect statistics on all columns or only on columns used in JOIN operations
• Collect distribution statistics on all, key, or no columns. Distribution statistics are very useful when you have an uneven distribution of data on key columns

3.2.2.1 Recommendations

• Care must be taken when running RUNSTATS since the information collected will impact the optimizer's selection of access paths.
• Implement RUNSTATS as part of a regularly scheduled maintenance plan if some of the above-mentioned conditions occur
• To ensure that the index statistics are synchronized with the table, execute RUNSTATS to collect both table and index statistics at the same time.
• Consider some of the following factors when deciding what type of statistics to collect:
o Collect statistics only for the columns used to join tables or in the WHERE, GROUP BY,
and similar clauses of queries. If these columns are indexed, you can specify the columns
with the ONLY ON KEY COLUMNS clause for the RUNSTATS command.
o Customize the values for num_freqvalues and num_quantiles for specific tables and
specific columns in tables.
o Collect DETAILED index statistics with the SAMPLE DETAILED clause to reduce the
amount of background calculation performed for detailed index statistics. The SAMPLE
DETAILED clause reduces the time required to collect statistics, and produces adequate
precision in most cases.
o When you create an index for a populated table, add the COLLECT STATISTICS clause
to create statistics as the index is created.
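A sketch of a RUNSTATS invocation that reflects several of these recommendations (the schema and table names are hypothetical); it collects table and index statistics together, including distribution statistics and sampled detailed index statistics:

db2 RUNSTATS ON TABLE DB2ADMIN.T1 WITH DISTRIBUTION AND SAMPLED DETAILED INDEXES ALL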
3.2.3 REBIND
After running RUNSTATS on your database tables, you need to rebind your applications to take
advantage of those new statistics. This is done to ensure the best access plan is being used for your SQL
statements. How that rebind takes place depends on the type of SQL you are running. DB2 provides
support for the following:

Dynamic SQL - SQL statements that are prepared and executed at run time. In dynamic SQL, the
SQL statement is contained as a character string in a host variable and is not precompiled.

Static SQL - SQL statements that are embedded within a program, and are prepared during the
program preparation process before the program is executed. After being prepared, a static SQL
statement does not change, although values of host variables specified by the statement can
change. These Static statements are stored in a DB2 object called a package
Both dynamic SQL statements and packages can be stored in one of DB2's caches. Based on the above types of SQL, a rebind will take place under these conditions:
• Dynamic SQL
  o If the statement is not in the cache, the SQL Optimizer will “bind” the statement and generate a new access plan
  o If the statement is in the cache, no “rebind” will take place
     - To clear the contents of the SQL cache, use the FLUSH PACKAGE CACHE SQL statement
• Static SQL
  o An explicit REBIND <package> is executed
  o Implicitly if the package is marked “invalid”
     - This can occur if, for example, an index that the package was using has been dropped.

3.2.3.1 Recommendations

o Perform a REBIND after running RUNSTATS as part of your normal database maintenance procedures.
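For example (the database, schema, and package names are hypothetical), packages can be rebound individually or for the whole database after RUNSTATS, and the dynamic SQL cache can be cleared:

db2 REBIND PACKAGE DB2ADMIN.MYPKG
db2rbind MYDB -l rebind.log all
db2 "FLUSH PACKAGE CACHE DYNAMIC"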
4. Application Design Details
4.1 Key Factors
4.1.1 Review application SQL for efficiencies *** IMPORTANT ***
If there is any one thing that you should focus on from this entire paper, it is this topic. In a significant
majority of cases, the single most important factor in DB2 performance is how efficiently your SQL statements are written. This topic mainly deals with SQL search criteria, which
can be present in SELECT, UPDATE, DELETE, or INSERT (through a subselect) statements.
Reviewing SQL serves the following purposes:

Provides the database designer with the necessary information they need in order to determine the
proper indexes that should be created on your database tables. These statements are essential for
the designer to be able to create the optimal indexes to support your database access. All of the
considerations mentioned above regarding indexes should be considered.

Allows an independent review of the SQL for the purpose of utilizing efficient SQL coding
techniques

Determine if locking strategies are appropriate

Assess the impact of changes in your data model or data content

Assess the impact of the application of service to the database manager

How to review the SQL statements will be discussed in Section 8: Tuning Approach
4.1.1.1 Recommendations
o Implement a formal SQL review process for your application(s)
4.1.2 Concurrency Control and Isolation Level
An isolation level determines how data is locked or isolated from other processes while the data is being
accessed. The isolation level will be in effect for the duration of the unit of work. DB2 supports the
following isolation levels, listed in order of most restrictive to least restrictive:

Repeatable Read - An isolation level that locks all the rows in an application that are referenced
within a transaction. When a program uses repeatable read protection, rows referenced by the
program cannot be changed by other programs until the program ends the current transaction.

Read Stability - An isolation level that locks only the rows that an application retrieves within a
transaction. Read stability ensures that any qualifying row that is read during a transaction is not
changed by other application processes until the transaction is completed, and that any row
changed by another application process is not read until the change is committed by that process.

Cursor Stability - An isolation level that locks any row accessed by a transaction of an application
while the cursor is positioned on the row. The lock remains in effect until the next row is fetched
or the transaction is terminated. If any data is changed in a row, the lock is held until the change is
committed to the database

Uncommitted Read - An isolation level that allows an application to access uncommitted changes
of other transactions. The application does not lock other applications out of the row that it is
reading, unless the other application attempts to drop or alter the table. Sometimes referred to as
“Dirty Reads”
4.1.2.1 Recommendations
o Make sure you know the isolation level under which you are running. Do not count on default
values, which can change based on how you are accessing the database.
o Because the isolation level determines how data is locked and isolated from other processes while
the data is being accessed, you should select an isolation level that balances the requirements of
concurrency and data integrity for your particular application. The isolation level that you specify
is in effect for the duration of the unit of work.
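For example (the table is hypothetical), the isolation level can be set for a CLP session with the CHANGE ISOLATION command (issued before connecting to the database), or overridden for an individual statement with an isolation clause:

db2 CHANGE ISOLATION TO CS
SELECT COLA, COLC FROM T1 WHERE COLB = 3 WITH UR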
4.1.3 Locking
To provide concurrency control and prevent uncontrolled data access, the database manager places locks
on tables, table blocks, or table rows. A lock associates a database manager resource with an application,
called the lock owner, to control how other applications can access the same resource. Locking is a
fundamental process of any database manager and is used to ensure the integrity of the data. But while
maintaining those locks, there is a potential impact on the concurrency and throughput of your
application.
There are a number of factors that the database manager uses to determine whether to use row level or
table level locking:

The different isolation levels described above are used to control access to uncommitted data,
prevent lost updates, allow non-repeatable reads of data, and prevent phantom reads. Use the
minimum isolation level that satisfies your application needs.

The access plan selected by the optimizer. Table scans, index scans, and other methods of data
access each require different types of access to the data.

The LOCKSIZE attribute for the table. This parameter indicates the granularity of the locks used
when the table is accessed. The choices are either ROW for row locks, or TABLE for table locks.

The amount of memory devoted to locking. The amount of memory devoted to locking is
controlled by the locklist database configuration parameter.
4.1.3.1 Recommendations
o *** IMPORTANT *** - COMMIT as frequently as possible and/or practical in order to release
any locks your application holds. If possible, design your application so that you can easily vary
the commit frequency for large batch operations. This will allow you to optimally balance the
throughput and concurrency of your system.
o Use ALTER TABLE... LOCKSIZE TABLE for read-only tables. This reduces the number of
locks required by database activity.
o If the lock list fills, performance can degrade due to lock escalations and reduced concurrency on
shared objects in the database. If lock escalations occur frequently, increase the value of either
locklist or maxlocks, or both.
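For example (the database, table names, and values are hypothetical), a read-only code table can be switched to table-level locking and the lock memory parameters increased as follows:

db2 ALTER TABLE DB2ADMIN.REF_CODES LOCKSIZE TABLE
db2 UPDATE DATABASE CONFIGURATION FOR MYDB USING LOCKLIST 400 MAXLOCKS 20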
4.1.4 Fetch needed columns only
There is additional CPU cost associated with each column selected or fetched from the database. Higher
I/O cost may also be experienced if sorting is required.
4.1.4.1 Recommendations

Select or fetch the columns that you need

Never code “SELECT *” to retrieve all columns in a table
4.1.5 Reuse resources
Consider reuse of the following components:
4.1.5.1 Recommendations

• Database connections – this can be accomplished using the connection pooling features of DB2. Connection pooling is a process in which DB2 drops the inbound connection with an application that requests disconnection, but keeps the outbound connection to the host in a pool. When a new application requests a connection, DB2 uses one from the existing pool. Using the already-present connection reduces the overall connection time, as well as the high processor connect cost on the host. Connection pooling is implemented using:
  o DB2 Connect
  o via JDBC, using the WebSphere connection pooling feature
4.1.6 SQL Statement Preparation
Before an SQL statement can be executed, it must be converted from text form to an executable form, by
submitting it to the SQL compiler. This is referred to as the SQL Statement Prepare process. After the
statement is prepared, the bind process occurs. This process converts the output from the SQL compiler
to a usable control structure, such as an access plan, application plan, or package. During the bind process,
access paths to the data are selected and some authorization checking is performed. DB2 supports two
types of SQL processing.

Static - SQL statements that are embedded within a program, and are prepared during the program
preparation process before the program is executed. After being prepared, a static SQL statement
does not change, although values of host variables specified by the statement can change.

Dynamic - SQL statements that are prepared and executed at run time. In dynamic SQL, the SQL
statement is contained as a character string in a host variable and is not precompiled.
Static SQL offers the advantage of executing the statement preparation process only once, thus eliminating
that processing step each time the statement is executed.
Dynamic SQL statements, by definition, are prepared and executed at run time. You can, however,
minimize the effect of statement preparation by writing your dynamic SQL statement using parameter
markers. Parameter markers act in a similar fashion to host variables in static SQL statements. If you use
them in a dynamic SQL statement, you would first issue a single PREPARE statement, followed by
multiple EXECUTE statements, which allow you to substitute values for the parameter markers. This
savings can be significant for simple SQL statements, like an INSERT, that are executed many times with
different values.
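A sketch of the pattern in embedded SQL (the statement name, table, and host variable names are hypothetical): the statement text containing parameter markers is prepared once and then executed repeatedly with different values:

EXEC SQL PREPARE INS_STMT FROM :STMT_TEXT;
EXEC SQL EXECUTE INS_STMT USING :HV1, :HV2;
EXEC SQL EXECUTE INS_STMT USING :HV3, :HV4;

where the host variable :STMT_TEXT holds a string such as 'INSERT INTO T1 (COLA, COLB) VALUES (?, ?)'.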
4.1.6.1 Recommendations

• Use static SQL whenever possible
o If using Java, SQLJ supports static SQL
• If you use dynamic SQL:
o Code them using parameter markers
o Increase the size of the database package cache. This cache stores dynamic SQL
statements and allows for their reuse
4.1.7 Query Tuning
The following SQL statement clauses may improve the performance of your application.

Use the FOR UPDATE clause to specify the columns that could be updated by a subsequent
positioned UPDATE statement.

Use the FOR READ/FETCH ONLY clause to make the returned columns read only.

Use the OPTIMIZE FOR n ROWS clause to give priority to retrieving the first n rows in the full
result set.

Use the FETCH FIRST n ROWS ONLY clause to retrieve only a specified number of rows.

Use the DECLARE CURSOR WITH HOLD statement to retrieve rows one at a time and maintain
cursor position after a commit

Take advantage of row blocking, by specifying the FOR READ ONLY, FOR FETCH ONLY,
OPTIMIZE FOR n ROWS clause, or if you declare your cursor as SCROLLing. This will
improve performance, and, in addition, improve concurrency because exclusive locks are never
held on the rows retrieved.

Avoid DISTINCT or ORDER BY if not required. This will help to eliminate any potential sorting
that may have to occur.
o Proper indexing may be used to eliminate SORTing also
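For example (the table and columns are hypothetical), several of these clauses combined on a query that only needs the first 20 qualifying rows:

SELECT COLA, COLC
   FROM T1
   WHERE COLB = 3
   ORDER BY COLA
   FETCH FIRST 20 ROWS ONLY
   FOR READ ONLY
   OPTIMIZE FOR 20 ROWS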
5. Hardware Design and Operating System Usage
5.1 Key Factors
This section will just discuss overall considerations for these factors and will not discuss detailed
calculations for capacity planning purposes.
5.1.1 Memory
Understanding how DB2 organizes memory helps you tune memory use for good performance. Many
configuration parameters affect memory usage. Some may affect memory on the server, some on the
client, and some on both. Furthermore, memory is allocated and de-allocated at different times and from
different areas of the system. While the database server is running, you can increase or decrease the size
of memory areas inside the database shared memory. You should understand how memory is divided
among the different heaps before tuning to balance overall memory usage on the entire system. Refer to
the “DB2 Administration Guide: Performance” for a detailed explanation of DB2’s memory model and all
of the parameters that affect memory usage.
5.1.2 CPU
The CPU utilization goal should be about 70 to 80% of the total CPU time. Lower utilization means that
the CPU can cope better with peak workloads. Utilization between 85% and 90% results in queuing delays
for CPU resources, which affect response times. CPU utilization above 90% usually results in
unacceptable response times. While running batch jobs, backups, or loading large amounts of data, the
CPU may be driven to high percentages, such as to 80 to 100%, to maximize throughput.
DB2 supports the following processor configurations:

Uni-Processor – a single system that contains only one single CPU

SMP (Symmetric Multiprocessor) – a single system that can contain multiple CPU’s. Scalability
is limited to the CPU sockets provided on the motherboard.

MPP (Massively Parallel Processors) – a system with multiple nodes connected over a high speed
link. Each node has its own CPU(s). Scalability is achieved by adding new nodes.
Things to consider regarding CPU:

Inefficient data access methods cause high CPU utilization and are major problems for database
systems. Refer back to section 4.1.1.

Paging and swapping requires CPU time. Consider this factor while planning your memory
requirements.
5.1.3 I/O
The following are rules of thumb that can be used to calculate total disk space required by an application.
If you have more detailed information, use that instead of these rules of thumb.

Calculate the raw data size
o Add up the column lengths of your database tables
o Multiply by the number of rows expected

Once you have the raw data size, use the following scaling-up ratios to factor in space for
indexing, working space, etc.
o OLTP ratio: 1:3
o DSS ratio: 1:4
o Data warehouse ratio: 1:5
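As a worked example (the numbers are purely illustrative): a table with an average row length of 200 bytes and 10,000,000 expected rows has a raw data size of roughly 2 GB; applying the OLTP ratio of 1:3 gives a planning estimate of about 6 GB for that table, including indexes and working space.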
Consider the following to improve disk efficiency:
 Minimize I/O – access to main memory is orders of magnitude faster than access to disk. Provide as much memory as possible to the database buffer pools and the various memory heaps to avoid I/O.
 When I/O is needed, reading simultaneously from several disks is the fastest way. Provide for parallel I/O operations by:
o Using several smaller disks rather than one big disk
o Placing the disk drives on separate controllers
5.1.3.1
Choosing Disk Drives
There are several trends in current disk technology:
 They get bigger every year, roughly doubling in capacity every 18 months.
 The cost per GB is lower each year.
 The cost difference between the two smallest drives diminishes until there is little point in continuing with the smaller drive.
 The disk drives improve a little each year in seek time.
 The disk drives get smaller in physical size.
While the disk drives continue to increase capacity with a smaller physical size, the speed improvements,
seek, etc., are small in comparison.
A database that would have taken 36 * 1 GB drives a number of years ago can now be placed on one disk.
This highlights the database I/O problems. For example, if each 1 GB disk drive can do 80 I/O operations
a second, this means the system can do a combined 36 * 80 = 2880 I/O operations per second. But a
single 36 GB drive with a seek time of 7 ms can do only 140 I/O operations per second. While increased
disk drive capacity is good news, the lower numbers of disks cannot deliver the same I/O throughput.
5.1.3.2
Recommendations
 When determining your I/O requirements, consider:
o OLTP Systems
 reading data will involve reading indexes
 inserts and updates require data, index, and logs to be written
 Provide for parallel I/O operations
o Use DMS tablespaces and separate data and indexes in separate tablespaces
o Use the smallest disk drives possible, purely on the basis of increasing the number of disks for I/O throughput. If buying larger drives, use only half the space (the middle area – it’s the fastest) for the database, and the other half for:
 Backups
 Archiving data
 Off-hour test databases
 Extra space used for upgrades
5.1.4
Network
The network can influence the overall performance of your application; network problems usually manifest themselves as a delay in the following situations:
o The time between when a client machine sends a request to the server and when the server receives the request
o The time between when the server machine sends data back to the client machine and when the client machine receives the data
Once a system is implemented, the network should be monitored to ensure that no more than 50% of its bandwidth is being consumed.
5.1.4.1
Recommendations
The following techniques can be used to improve overall performance and avoid high network consumption:
o Transmit a block of rows to the client machine in a single operation. This is accomplished by using the BLOCKING option in the precompile or bind procedures. Refer to section 4.1.8, Query Tuning, for other factors that influence row blocking.
o Use stored procedures to minimize the number of accesses to the database. Stored procedures are programs that reside on the RDBMS server and can be executed as part of a transaction by the client applications. This way, several pre-programmed SQL statements can be executed with only one CALL command from the client machine. (A hedged example of both techniques follows this list.)
o Using stored procedures will typically make it more difficult to run your application on different database platforms, such as Oracle or SQL Server, because of syntactic differences among their stored procedure implementations. If you need to run your application on multiple database platforms, be aware of this trade-off.
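As a hedged sketch of both techniques (the database name, bind file, procedure name, and parameter values are hypothetical, not part of any shipped product):
    db2 CONNECT TO mydb
    db2 BIND myapp.bnd BLOCKING ALL
    CALL update_order_status (1001, 'SHIPPED')
The BLOCKING ALL bind option requests row blocking for read-only and ambiguous cursors in the package, and the single CALL statement executes the pre-programmed SQL inside the stored procedure with one request from the client.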
6.
Tuning Opportunities
This section contains some additional tuning considerations not already discussed.
6.1
Insert/Delete Performance
Here are some things to consider about insert and delete performance. The biggest bottlenecks are
typically:
o Read and Write I/O for index and data
o Active Log Write
o CPU Time
o Locking
We’ve previously discussed ways to address many of these issues, such as:
o Using DMS tablespaces and placing the table and index data into separate tablespaces to enable parallel I/O
o Providing for efficient logging
o Using parameter markers to prepare an INSERT statement once and execute it many times
o Batching SQL statements, for example via the JDBC batch facility
o Minimizing indexing
Here are a few other suggestions to improve insert performance (a hedged sketch follows this list):
o Consider the use of APPEND MODE
o Insert multiple rows with one INSERT statement
o Tune the PCTFREE values for the data, clustering index, and non-clustering index components
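The following sketch pulls several of these suggestions together; the tablespace names, container paths, sizes, and the table itself are hypothetical and only illustrate the syntax:
    CREATE TABLESPACE order_data MANAGED BY DATABASE USING (FILE '/db2/cont/order_data.dat' 51200);
    CREATE TABLESPACE order_idx MANAGED BY DATABASE USING (FILE '/db2/cont/order_idx.dat' 25600);
    CREATE TABLE orders (order_id INTEGER NOT NULL, cust_id INTEGER, status CHAR(10))
        IN order_data INDEX IN order_idx;
    ALTER TABLE orders APPEND ON;
    INSERT INTO orders (order_id, cust_id, status) VALUES (?, ?, ?), (?, ?, ?), (?, ?, ?);
The DMS tablespaces separate data and index I/O, APPEND ON avoids free-space searching on insert, and the multi-row INSERT with parameter markers lets one prepared statement add several rows per execution.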
The following database parameter is important.
o Number of asynchronous page cleaners (NUM_IOCLEANERS) - This parameter controls the
number of page cleaners that write changed pages from the buffer pool to disk. You may want to
increase this to the number of physical disk drive devices you have. The default is 1.
The following database manager parameter may be important.
o Enable intra-partition parallelism (INTRA_PARALLEL) – if you have a multi-processor SMP system, setting this parameter to YES may improve performance. The default is NO.
The following registry variable can be used:
o DB2MAXFSCRSEARCH - The setting of this registry variable determines the number of Free Space Control Records (FSCRs) in a table that are searched for an INSERT. The default value for this registry variable is five. If no space is found within the specified number of FSCRs, the inserted record is appended at the end of the table. To optimize INSERT speed, subsequent records are also appended to the end of the table until two extents are filled. After the two extents are filled, the next INSERT resumes searching at the FSCR where the last search ended.
o To optimize for INSERT speed at the possible expense of faster table growth, set the DB2MAXFSCRSEARCH registry variable to a small number. To optimize for space reuse at the possible expense of INSERT speed, set DB2MAXFSCRSEARCH to a larger number.
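As a hedged example (the database name and the values shown are illustrative only), these settings can be changed with the standard configuration commands; the registry change generally takes effect after the instance is restarted:
    db2 UPDATE DB CFG FOR mydb USING NUM_IOCLEANERS 4
    db2 UPDATE DBM CFG USING INTRA_PARALLEL YES
    db2set DB2MAXFSCRSEARCH=1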
6.2
Database Manager Configuration Tuning
Each instance of the database manager has a set of database manager configuration parameters (also
called database manager parameters). These affect the amount of system resources that will be allocated
to a single instance of the database manager. Some of these parameters are used to configure the setup of the database manager and hold other, non-performance-related information.
There are numerous database manager configuration parameters. I will list only some of the ones that
have a high impact on performance. Refer to the “DB2 Administration Guide: Performance” for a detailed
explanation of all the database manager configuration parameters.
o Agentpri - This parameter controls the priority given both to all agents, and to other database
manager instance processes and threads, by the operating system scheduler. Use the default
unless you run a benchmark to determine the optimal value.
o Aslheapsz – the application support layer heap represents a communication buffer between the
local application and its associated agent. This buffer is allocated as shared memory by each
database manager agent that is started. If the request to the database manager, or its associated reply, does not fit into the buffer, it will be split into two or more send-and-receive pairs. The size
of this buffer should be set to handle the majority of requests using a single send-and-receive pair.
o Intra_parallel - This parameter specifies whether the database manager can use intra-partition
parallelism on an SMP machine. Multiple processors can be used to scan and sort data for index
creation.
o Java_heap_sz - This parameter determines the maximum size of the heap that is used by the Java
interpreter started to service Java DB2 stored procedures and UDFs (User Defined Functions).
o Max_querydegree - This parameter specifies the maximum degree of intra-partition parallelism
that is used for any SQL statement executing on this instance of the database manager. An SQL
statement will not use more than this number of parallel operations within a partition when the
statement is executed. The intra_parallel configuration parameter must be set to “YES” to enable
the database partition to use intra-partition parallelism.
o Sheapthres – the sort heap threshold determines the maximum amount of memory available for all
the operations that use the sort heap including: sorts, hash joins, dynamic bitmaps (used for index
ANDing and Star Joins), and operations where the table is in memory. Ideally, you should set
this parameter to a reasonable multiple of the largest sortheap parameter you have in your
database manager instance. This parameter should be at least two times the largest sortheap
defined for any database within the instance.
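For example (the database name and values are illustrative), keeping the sort heap threshold at several times the largest sort heap in the instance:
    db2 UPDATE DB CFG FOR mydb USING SORTHEAP 2048
    db2 UPDATE DBM CFG USING SHEAPTHRES 8192
Here the threshold is four times the sort heap, comfortably above the two-times minimum suggested above.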
6.3
Database Configuration Tuning
Each database has a set of the database configuration parameters (also called database parameters). These
affect the amount of system resources that will be allocated to that database. In addition, there are some
database configuration parameters that provide descriptive information only and cannot be changed;
others are flags that indicate the status of the database.
There are numerous database configuration parameters. I will list only some of the ones that have a high
impact on performance. Refer to the “DB2 Administration Guide: Performance” for a detailed explanation
of all the database configuration parameters.
o Chngpgs_thresh - Asynchronous page cleaners will write changed pages from the buffer pool (or
the buffer pools) to disk before the space in the buffer pool is required by a database agent. As a
result, database agents should not have to wait for changed pages to be written out so that they
might use the space in the buffer pool. This improves overall performance of the database
applications.
o Locklist - This parameter indicates the amount of storage that is allocated to the lock list. This has
a high impact on performance if there are frequent lock escalations.
o Maxlocks – maximum percent of lock list before escalation. Used in conjunction with the locklist
parameter to control lock escalations.
o Logbufsz - This parameter allows you to specify the amount of the database heap (defined by the
dbheap parameter) to use as a buffer for log records before writing these records to disk.
Buffering the log records will result in more efficient logging file I/O because the log records will
be written to disk less frequently and more log records will be written each time.
o Num_iocleaners - This parameter allows you to specify the number of asynchronous page cleaners
for a database. These page cleaners write changed pages from the buffer pool to disk before the
space in the buffer pool is required by a database agent. As a result, database agents should not
have to wait for changed pages to be written out so that they might use the space in the buffer
pool. This improves overall performance of the database applications. Set this parameter to be
between one and the number of physical storage devices used for the database.
o Num_ioservers - I/O servers are used on behalf of the database agents to perform prefetch I/O and
asynchronous I/O by utilities such as backup and restore. This parameter specifies the number of
I/O servers for a database. A good value to use is generally one or two more than the number of
physical devices on which the database resides.
o Pckcachesz – The package cache is used for caching of sections for static and dynamic SQL
statements on a database. Caching packages and statements allows the database manager to
reduce its internal overhead by eliminating the need to access the system catalogs when reloading
a package; or, in the case of dynamic SQL, eliminating the need for compilation.
o Sortheap - This parameter defines the maximum number of private memory pages to be used for
private sorts, or the maximum number of shared memory pages to be used for shared sorts. Each
sort has a separate sort heap that is allocated as needed, by the database manager. This sort heap is
the area where data is sorted. Increase the size of this parameter when frequent large sorts are
required.
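A hedged example of adjusting several of these database parameters in one command (the database name and values are illustrative, not recommendations):
    db2 UPDATE DB CFG FOR mydb USING LOCKLIST 1000 MAXLOCKS 20 LOGBUFSZ 256 CHNGPGS_THRESH 40 PCKCACHESZ 1024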
6.4
Bufferpools
The buffer pool is the area of memory where database pages (table rows or indexes) are temporarily read
and manipulated. All buffer pools reside in global memory, which is available to all applications using the
database. The purpose of the buffer pool is to improve database performance. Data can be accessed much
faster from memory than from disk. Therefore, the more data (rows and indexes) the database manager is
able to read from or write to memory, the better the database performance. The default buffer pool
allocation is usually not sufficient for production applications and needs to be monitored and tuned before
placing your application in production.
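For example (the size and database name are illustrative; IBMDEFAULTBP is the default buffer pool), the allocation can be changed and its activity then checked:
    db2 "ALTER BUFFERPOOL IBMDEFAULTBP SIZE 50000"
    db2 "GET SNAPSHOT FOR BUFFERPOOLS ON mydb"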
6.5
Registry Variables
Each instance of the database manager has a set of Registry and Environment variables. These affect
various aspects of DB2 processing.
There are numerous Registry and Environment variables. I will list only some of the ones that have a
high impact on performance. Refer to the “DB2 Administration Guide: Performance” for a detailed
explanation of all the Registry and Environment variables.
o DB2_Parallel_IO - While reading or writing data from and to table space containers, DB2 may use
parallel I/O for each table space value that you specify. The degree of parallelism is determined by
the prefetch size and extent size for the containers in the table space. For example, if the prefetch
size is four times the extent size, then there are four extent-sized prefetch requests. The number of
containers in the table space does not affect the number of prefetchers. To enable parallel I/O for
all table spaces, use the wildcard character, ″*″. To enable parallel I/O for a subset of all table
spaces, enter the list of table spaces. If there is more than one container, extent-size pieces of any
full prefetch request are broken down into smaller requests executed in parallel based on the
number of prefetchers. When this variable is not enabled, the number of prefetcher requests
created is based on the number of containers in the table space.
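For example, to enable parallel I/O for all table spaces (registry variable changes generally take effect after the instance is restarted):
    db2set DB2_PARALLEL_IO=*
    db2stop
    db2start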
7.
Monitoring Tools
DB2 provides several tools that can be used for monitoring or analyzing your database. These monitoring
and analyzing tools, along with their purposes are:
o Snapshot Monitor – captures performance information at periodic points of time. Used to
determine the current state of the database
o Event Monitor - provides a summary of activity at the completion of events such as statement
execution, transaction completion, or when an application disconnects.
o Explain Facility - provides information about how DB2 will access the data in order to resolve the
SQL statements.
o db2batch tool - provides performance information (benchmarking tool)
8.
Tuning Methodology
8.1
Tuning Approach
The objective of tuning an RDBMS is to make sure that the system is delivering good performance. As
with most things, the ability to meet that objective depends on the effort and resources you apply to the
tuning effort. The following situations typically trigger a tuning effort:
 Regular task - Regular periodic monitoring and tuning is standard practice. Many sites review performance at quarterly, half-yearly, or yearly intervals.
 Generated warning - Automated monitoring of the system warns that performance is degrading and has hit some threshold.
 Emergency - There is an emergency in performance or response time, which has been highlighted by user feedback. The tuning must identify the problem, recommend a solution, and then work out how to avoid it happening again.
 New system - A newly built system requires initial tuning for maximum performance before going into production. In many cases, a new system might be put into production before being optimally tuned because of the difficulty of generating user workloads artificially and of predicting real user workloads, work patterns, and data volume and distribution. For this reason, it is critical that the design principles outlined in this document be followed, so as to minimize the effects of the unknown as much as possible.
 System change – if the change is significant, this is similar to a new system if testing cannot effectively simulate production.
Regardless of the reason, the tuning approach will largely be the same.
8.2
Skills Needed
While this document focuses on database design and tuning, there are many other factors that influence
the overall performance of any application. Other skills that may be required to tune the overall system
include:
 Hardware Experts – for the various hardware platforms you plan to run on
 Operating System Experts – for the various operating system platforms you plan to run on. This could include:
o System administration experts
o Operating system performance and tuning experts
 Relational Database skills, including:
o DBA skills
o SQL Tuning
 Middleware Experts – if using middleware products such as WebSphere
 Applications Experts
o Functional knowledge of the application
o Technical experts on the products used to build the application, for example:
 Java
 C++
8.3
General Process
The following process is recommended to improve the performance of any system:
1. Establish performance indicators.
2. Define performance objectives.
3. Develop a performance monitoring plan.
4. Carry out the plan.
5. Analyze your measurements to determine whether you have met your objectives. If you have,
consider reducing the number of measurements you make because performance monitoring itself
uses system resources. Otherwise, continue with the next step.
6. Determine the major constraints in the system.
7. Decide where you can afford to make trade-offs and which resources can bear additional load.
(Nearly all tuning involves trade-offs among system resources and the various elements of
performance.)
8. Adjust the configuration of your system. If you think that it is feasible to change more than one
tuning option, implement one at a time. If there are no options left at any level, you have reached
the limits of your resources and need to upgrade your hardware.
9. Return to Step 4 above and continue to monitor your system.
On a periodic basis, or after significant changes to your system:
 Perform the above procedure again from step 1.
 Re-examine your objectives and indicators.
 Refine your monitoring and tuning strategy.
8.4
DB2 Specific Tuning
There are numerous documents and books written describing detailed formal tuning methodologies. This
document is not intended to repeat, or rewrite, any of that. I will discuss some specific DB2 topics, tools,
and techniques that can be used to tune your DB2 database.
8.4.1
SQL Reviews
As mentioned earlier in this document, SQL reviews are essential for a well-performing RDBMS system. SQL reviews in DB2 are generally done using the SQL Explain facility. Explain allows you to capture information about the access plan chosen by the optimizer as well as performance information that helps you tune queries. Before you can capture explain information, you need to create the relational tables in which the optimizer stores the explain information and set the special registers that determine what kind of explain information is captured. These tables can be created:
 Automatically by the DB2 Control Center
 By running the following command from a DB2 command window:
o db2 -tf EXPLAIN.DDL (located in the sqllib/misc directory)
DB2 provides a number of facilities to view the information generated by the EXPLAIN facility. These
include:
 Visual Explain – you invoke Visual Explain from the Control Center to see a graphical display of a query access plan. You can analyze both static and dynamic SQL statements.
 db2exfmt – this command line tool is used to display explain information in preformatted output.
 db2expln and dynexpln – these command line tools are used to see the access plan information available for one or more packages of static SQL statements. db2expln shows the actual implementation of the chosen access plan; it does not show optimizer information. The dynexpln tool, which uses db2expln within it, provides a quick way to explain dynamic SQL statements that contain no parameter markers. This use of db2expln from within dynexpln is done by transforming the input SQL statement into a static statement within a pseudo-package. When this occurs, the information may not always be completely accurate. If complete accuracy is desired, use the explain facility. The db2expln tool does provide a relatively compact and English-like overview of what operations will occur at run time, by examining the actual access plan generated.
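As a hedged illustration (the database name, query, and output file are hypothetical), one common sequence for explaining a dynamic statement and formatting the result with db2exfmt is:
    db2 "SET CURRENT EXPLAIN MODE EXPLAIN"
    db2 "SELECT cust_id, cust_name FROM customer WHERE region = 'EAST'"
    db2 "SET CURRENT EXPLAIN MODE NO"
    db2exfmt -d mydb -1 -o customer_query.exfmt
With the special register set to EXPLAIN, the statement is compiled and explained but not executed; the -1 option asks db2exfmt to use default values for its remaining arguments.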
Which tool you use depends on a number of factors, including:
 How complicated the SQL is
 Whether you have access to the actual program/code that will be running
 Your familiarity with the tools
Previously, I discussed the functions of the DB2 optimizer. The DB2 optimizer uses information and
statistics in the DB2 catalog in order to determine the best access to the database based on the query
provided. This information is generally gathered using the RUNSTATS utility. The catalog information
includes the following types of information:
 Number of rows in a table
 Indexes defined on the table
 Whether the table needs to be REORGanized
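For example, to refresh these catalog statistics for a hypothetical table before explaining queries against it:
    db2 "RUNSTATS ON TABLE myschema.customer WITH DISTRIBUTION AND DETAILED INDEXES ALL"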
8.4.2
Database Statistics
Ideally, you would use the explain facility mentioned above against a database that has production level
volumes loaded into it. The access path chosen may be drastically different if a table has 100 rows in a
test database and 10,000,000 rows in a production database. This difference is exacerbated if the SQL
query involves a join of multiple tables.
Often it is not easy or practical, particularly early on, to have a database loaded with production-level volumes. DB2 provides a facility for simulating a production database by allowing you to code SQL UPDATE statements that operate against a set of predefined catalog views in the SYSSTAT schema. With SYSSTAT, the database administrator is able to simulate production volumes in a test database. Updating these views allows the optimizer to create different access plans under different conditions. The following is a sample of one of these UPDATE statements:
UPDATE SYSSTAT.TABLES SET CARD = 850000 WHERE TABNAME = 'CUSTOMER'
Although this facility is provided, it should be used only as an initial means of updating statistics. There are many inter-relationships among the catalog statistics for the database tables, and understanding how to update all of them correctly is extremely difficult.
8.4.2.1
Using Database Statistics for SQL Access Plan Analysis
The following sources should be used in the order listed when using the EXPLAIN facility to analyze
your SQL statements. Make sure any statistics you use are current by running the RUNSTATS utility.
1. production database (or an image of one)
2. a test database loaded with a significant amount of data
3. a test database updated with production statistics
o The DB2 tool db2look is designed to capture the table DDL and statistics of a production database so that they can be replicated on the test system.
4. update the SYSSTAT views
o The UPDATE statements for the SYSSTAT views can be generated by the db2look facility.
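A hedged example of the db2look step (the database names and output file are illustrative); -e extracts the DDL and -m generates the UPDATE statements that mimic the production statistics, which can then be run against the test database assuming the same tables exist there:
    db2look -d proddb -e -m -o prod_stats.sql
    db2 CONNECT TO testdb
    db2 -tvf prod_stats.sql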
8.4.3
SNAPSHOT and EVENT Monitors
DB2 maintains data about its operation, its performance, and the applications using it. This data is
maintained as the database manager runs, and can provide important performance and troubleshooting
information. For example, you can find out:
 The number of applications connected to a database, their status, and which SQL statements each application is executing, if any.
 Information that shows how well the database manager and database are configured, and helps you to tune them.
 When deadlocks occurred for a specified database, which applications were involved, and which locks were in contention.
 The list of locks held by an application or a database. If the application cannot proceed because it is waiting for a lock, there is additional information on the lock, including which application is holding it.
Collecting performance data introduces overhead on the operation of the database. DB2 provides
monitor switches to control which information is collected. You can turn these switches on by using the
following DB2 commands:
 UPDATE MONITOR SWITCHES USING BUFFERPOOL ON;
 UPDATE MONITOR SWITCHES USING LOCK ON;
 UPDATE MONITOR SWITCHES USING SORT ON;
 UPDATE MONITOR SWITCHES USING STATEMENT ON;
 UPDATE MONITOR SWITCHES USING TABLE ON;
 UPDATE MONITOR SWITCHES USING UOW ON;
You can access the data that the database manager maintains either by taking a snapshot or by using an
event monitor.
8.4.3.1
SNAPSHOTs
Use the GET SNAPSHOT command to collect status information and format the output for your use. The
information returned represents a snapshot of the database manager operational status at the time the
command was issued. There are various formats of this command that are used to obtain different kinds
of information. The specific syntax can be obtained from the DB2 Command Reference. Some of the
more useful ones are:
 GET SNAPSHOT FOR DATABASE - Provides general statistics for one or more active databases on the current database partition.
 GET SNAPSHOT FOR APPLICATIONS - Provides information about one or more active applications that are connected to a database on the current database partition.
 GET SNAPSHOT FOR DATABASE MANAGER - Provides statistics for the active database manager instance.
 GET SNAPSHOT FOR LOCKS - Provides information about every lock held by one or more applications connected to a specified database.
 GET SNAPSHOT FOR BUFFERPOOLS - Provides information about buffer pool activity for the specified database.
 GET SNAPSHOT FOR DYNAMIC SQL - Returns a point-in-time picture of the contents of the SQL statement cache for the database.
You can create some simple scripts and schedule them to get periodic snapshots during your test cycles.
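A minimal sketch of such a script (the file name, database name, and the particular snapshots chosen are all illustrative):
    -- snapshots.clp: run with  db2 -tvf snapshots.clp
    CONNECT TO mydb;
    GET SNAPSHOT FOR DATABASE ON mydb;
    GET SNAPSHOT FOR BUFFERPOOLS ON mydb;
    GET SNAPSHOT FOR DYNAMIC SQL ON mydb;
    CONNECT RESET;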
8.4.4
DB2BATCH
A benchmark tool called db2batch is provided in the sqllib/bin subdirectory of your DB2 installation.
This tool can read SQL statements from either a flat file or standard input, dynamically describe and
prepare the statements, and return an answer set. You can specify the level of performance-related
information supplied, including the elapsed time, CPU and buffer pool usage, locking, and other statistics
collected from the database monitor. If you are timing a set of SQL statements, db2batch also summarizes
the performance results and provides both arithmetic and geometric means. For syntax and options, type
db2batch.
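For example (the database, input file, and output file names are hypothetical):
    db2batch -d mydb -f queries.sql -r results.out
Here queries.sql contains the SQL statements to be timed, and results.out receives the answer sets and the timing summary.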
9.
Publications & Online Help
The following books serve as reference material.
Information Units and Contents:
 Administration Guide: Performance (SC09-4821) – This book contains information about how to configure and tune your database environment to improve performance.
 Database Performance Tuning on AIX (SG24-5511-01) – This Redbook contains hints and tips from experts that work on RDBMS performance every day. It also provides introductions to general database layout concepts from a performance point of view, design and sizing guidelines, tuning recommendations, and performance and tuning information for DB2 UDB, Oracle, and IBM Informix databases.
 DB2 UDB V7.1 Performance Tuning Guide (SG24-6012) – This IBM Redbook provides guidelines for system design, database design, and application design with DB2 UDB for AIX Version 7.1. It also discusses the methods that are available for performance analysis and tuning.
10. Sizing
The following spreadsheets are provided in the Project Database on an “as-is” basis as samples to assist with tablespace and index sizing and with DB2 memory utilization estimates.
1. TBLDATA-zOS.XLS - Spreadsheet to estimate space requirements for a table on the mainframe
2. INDXDATA-zOS.XLS - Spreadsheet to estimate space requirements for an index on the
mainframe
3. TBLDATA-dist.XLS - Spreadsheet to estimate space requirements for a table in the distributed
environment
4. INDXDATA-dist.XLS - Spreadsheet to estimate space requirements for an index in the distributed
environment
5. DB2UDBMEMORY.XLS - a spreadsheet containing the estimated memory usage of the system
11. Reference
12. Addendum