L.L. Bean Inc.
DB2 Performance Recommendations
Version 0.1, June 28, 2017
Shelton Reese
Database Consultant
Table of Contents
Table of Contents.................................................................................................. 2
Introduction ........................................................................................................... 2
The Process of Tuning LL Bean’s DB2 for AIX 5.5 ............................................... 3
Operating system settings ............................................................................................... 4
Performance fundamentals.............................................................................................. 6
Updating catalog statistics .............................................................................................. 9
Monitoring and tuning database configuration parameters .......................................... 11
Buffer pool size ............................................................................................................. 11
Log buffer size (LOGBUFSZ) ...................................................................................... 14
Application heap size (APPLHEAPSZ) ....................................................................... 15
Sort heap size (SORTHEAP) and sort heap threshold (SHEAPTHRES)..................... 15
Number of agents (MAXAGENTS, NUM_POOLAGENTS and NUM_INITAGENTS) ............. 17
Locks (LOCKLIST, MAXLOCKS and LOCKTIMEOUT) ......................................... 19
Maximum number of active applications (MAXAPPLS) ............................................ 21
Number of asynchronous page cleaners (NUM_IOCLEANERS)................................ 22
Number of I/O servers (NUM_IOSERVERS) .............................................................. 24
Number of commits to group (MINCOMMIT) ............................................................ 24
AIX Disk Layout Considerations.......................................................................... 26
Conclusion .......................................................................................................... 27
Appendix: DB2_STATUS (KORN SHELL Script to Perform the Configuration Update) . 27
Introduction
L.L.Bean, Inc. is a privately held mail-order and retail company based in Freeport, Maine,
specializing in clothing and outdoor recreation equipment. Its annual sales were US$1.47 billion
in 2005. LL Bean's Internet interface requires the use of multiple technology layers. Keeping these
applications performing optimally is an intense challenge because of the scope and technological
diversity of the application.
Understanding the DB2 UDB architecture and all its implications is quite challenging. Moreover,
IBM provides over eighty parameters that affect the tuning of the database manager or the
underlying databases. Because the parameters interact in complex ways, it is easy to adjust a
parameter to address a particular performance issue, only to have unforeseen detrimental results
in another area of the database. At LL Bean, the key problem is database configuration settings
that negatively affect overall performance. All numbers used in this document were collected
from the ECOMP DB2 instance. All software has been delivered via the support directory on the
production server.
IBM was brought in to resolve the performance issues that LL Bean was experiencing. As part of
the effort to resolve the performance problems, I installed and executed seven pieces of software
on the production “ecomp” server at LL Bean, as follows.

- db2_status.ksh -d <dbname> : this script waits for the sampling time and then gathers
  all the DB2 snapshot information into the IBM snapshot tables. The script provides the
  following data for review:
  - The hit ratio for the data and index pages of every buffer pool.
  - The hit ratio for the catalog cache.
  - The hit ratio for the package cache.
  - The number of sorts serviced and the number of sorts rejected.
  - The average elapsed time for each sort.
  - The UPDATE command needed to bring the configuration to the proper settings.
- aix_check.ksh <dbname> : this script reads the AIX configuration and reports any
  problems found, with possible suggestions.
The Process of Tuning LL Bean’s DB2 for AIX 5.5
This feedback is organized into three sections. Within these sections, I provide the
approach I used to identify problems and the actions I suggest for solving them.

- Performance fundamentals and operating system settings, which provides essential
  recommendations based on the analysis of documents and snapshots taken 11/11/07 and
  11/14/07. All documentation is stored in the LL Bean ~/support/snapout directory, named
  by runtime timestamp for reference.
- Updating catalog statistics, which emphasizes the importance of collecting and
  maintaining up-to-date database statistics; stale statistics are often the source of
  many performance problems.
- Monitoring and tuning LL Bean's database configuration parameters, which describes
  database manager parameters and database parameters in order of importance. It is
  normally not necessary to go through the whole list to achieve the performance goals;
  I recommend adjusting only several of them to gain a performance improvement.
With these procedures, IBM can get LL Bean’s Internet application up and running with verifiable
and reasonably good performance.
Operating system settings
Recommendation
The following settings are recommended to allow maximum resource usage of AIX by DB2.
CPU, Memory & Kernel
ulimit
'ulimit' defines the file descriptor limit: the number of open files permitted per process. If
this value is set too low, a memory allocation error will occur on AIX and a "too many open
files" error will be logged to the stderr log file. Set this value higher than the default
system value. For large multiprocessor machines, set it to unlimited.
How to change the parameter
# ulimit -a                  // to query the current limits
# ulimit -n <new setting>    // to set a new value
# ulimit -n unlimited        // to set it to unlimited
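As a quick sanity check, the limit can be compared against a floor before starting DB2. This is a sketch, not part of the original report; the 8192 threshold is an assumed value for illustration, not an AIX default.

```shell
# Warn when the open-file limit is below a chosen floor (8192 is an
# assumed threshold for illustration, not an AIX default).
limit=$(ulimit -n)
floor=8192
if [ "$limit" != "unlimited" ] && [ "$limit" -lt "$floor" ]; then
    echo "WARNING: ulimit -n is $limit (below $floor); raise it before starting DB2"
else
    echo "ulimit -n is $limit: OK"
fi
```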
Maximum Number of Processes Per User
'maxuproc' defines the maximum number of processes each AIX user can fork. If it is set
too low, you may not be able to create the high number of database agents necessary to
handle connections from your application, so it is recommended to set this to at least 4096.
How to change the parameter
Set the AIX maxuproc (maximum number of processes per user) device attribute to 4096 by
entering the command:
chdev -l sys0 -a maxuproc='4096'
Investigative Steps
The value set by AIX (1024) has been found to be too low for some large-scale database
environments, causing DB2 to generate the SQL error message "SQL1402N - Unable to
authenticate user due to unexpected system error."
To check the value run:
lsattr -l sys0 -E | grep maxuproc
Shared Memory Segments
The 'EXTSHM' environment variable defines the maximum number of memory segments shared by
all user-mode applications. Some database servers, such as IBM DB2, rely on this environment
variable to support large workloads. To set it, add the following lines to a user's profile:

EXTSHM=ON
export EXTSHM
Network
In a high-volume environment, the TCP layer on AIX should be tuned to values similar to those listed below:
Investigative Steps
To display a list of current tunable values, use:
/usr/sbin/no -a
To make TCP changes permanent, add the following lines into the /etc/rc.net file:
/usr/sbin/no -o sb_max=6192000
/usr/sbin/no -o tcp_sendspace=4096000
/usr/sbin/no -o tcp_recvspace=4096000
/usr/sbin/no -o udp_sendspace=65536
/usr/sbin/no -o udp_recvspace=655360
/usr/sbin/no -o rfc1323=1
/usr/sbin/no -o ipqmaxlen=150
/usr/sbin/no -o clean_partial_conns=true
Ethernet Device Driver Properties
To display a list of current ethernet adapter device driver settings such as MTU, IPs, etc., use:
lsattr -E -l en2
where en2 is the adapter name; chances are you have several (en0, en1, etc.).
To change the adapter network settings, one way is to run the command-line tool chdev:
# ifconfig en0 detach
# chdev -l ent0 -a tx_que_size=8192
# ifconfig en0 hostname up
This sets the transmit queue size (tx_que_size) to 8192; the adapter must be detached
before the change and brought back up afterwards.
Disk I/O
Tune FILE SYSTEM Performance
The following file system mount option settings allow you to take advantage of caching,
throughput optimization, and the performance of UNIX-attached external storage devices, such as a SAN.
Update the /etc/filesystems, and modify the wanted file system section(s) to use 'dio':
/mydb2:
dev = /dev/fwvol
vfs = jfs2
log = vjfsxxx
mount = true
options = dio,rw
account = false
Enabling file system direct I/O on AIX bypasses the file system buffer cache, eliminating
the double buffering of data that can adversely affect file system I/O performance.
For the change to take effect, run 'mount -a' if the file system you are modifying is not
mounted. If the file system is mounted, it must be unmounted and remounted, or the system
rebooted.
Performance fundamentals
Following are the findings of my analysis regarding the state of LL Bean’s system and the
parameters that are affecting performance.
1. There is enough memory at the current amount of 45 GB.
   - ECOMP, a 64-bit system, uses at least 3.7 GB of RAM per CPU, up to 8 GB per
     machine, to support the buffer pools, DB2 agents, and other shared memory
     objects required for a large number of concurrent users. (See Buffer pool size
     (BUFFPAGE) later in this document for more information on buffer pools.)
   - In the VMSTAT report there are three columns of particular interest. The key
     figures for CPU utilization (in percentage) are the columns entitled us, sy,
     and id, with "id" being the percentage of CPU that is idle. A value below 10%
     is great. For LL Bean's multiple CPUs, if this average is over 20%, DBHEAP in
     the db2 configuration should be updated, because memory may be needed to
     support applications that run locally without stored procedures.
   - The amount of memory required by each user connection into the database (that
     is, a DB2 agent) depends on the nature of the SQL statements performed by the
     LL Bean application, such as the number of concurrent cursors opened and the
     amount of sorting and temp space required. For the LL Bean application, there
     should be little sorting and temp space required and only a handful of
     concurrent cursors opened at a time.

   Rule of thumb: Use a minimum of 1 MB for AIX for each DB2 agent. If fenced
   stored procedures are used, then each user connection has two DB2 agents, in
   addition to the memory required to run the stored procedure application.
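The rule of thumb above can be sketched as quick shell arithmetic. The 500-connection figure is an assumed workload for illustration, not an LL Bean number.

```shell
# Estimate agent memory under the 1 MB-per-agent rule of thumb;
# fenced stored procedures double the agent count per connection.
connections=500                           # assumed workload, not from the report
fenced=1                                  # 1 = fenced stored procedures in use
agents_per_conn=$(( fenced == 1 ? 2 : 1 ))
mb=$(( connections * agents_per_conn ))   # 1 MB per agent on AIX
echo "estimated agent memory: ${mb} MB"
```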
2. LL Bean has sufficient I/O handling capability.
   - There must be enough disk arms to ensure sufficient I/O parallelism to support
     a high volume of concurrent transactions: at least 5 to 10 disks per CPU for a
     moderate workload, and 20 disks for a high-I/O OLTP workload. The operating
     system (including paging space), DB2 logs, and DB2 table spaces should each
     have their own dedicated disks, and there should be multiple disks for the DB2
     logs, for tables, and for indexes. Use the AIX_CHECK script to check how the
     disk bandwidth is doing.
   - The proper way to estimate the I/O handling capability needed for good
     performance is to prototype the transactions and find out how many I/Os are
     required per transaction and how many transactions per second are required.
     Then find out the I/O rate for the disk controller and the disk subsystem to
     help determine how many controllers and disks are required.
   - Perhaps the two most important columns in the AIX_CHECK report are %tm_act
     and tps. tps is the number of I/O transfers per second issued to the device,
     a measure of how many jobs are contending for the disk (high is bad), while
     %tm_act is the percentage of time the disk is busy (one device that is high
     relative to the others is bad). At LL Bean, hdisk35 reported %tm_act above 70%.
3. There is sufficient network bandwidth.
   There must be enough network bandwidth to support the workload. Make sure that the
   network and any intermediate hubs are not a bottleneck. This is especially significant
   when supporting access from remote locations. For example, a T1 line supports 1.544
   Mbit/sec, which is only 0.193 MB/sec, whereas a typical 10 Mbit/sec Ethernet LAN can
   support roughly 6x that throughput at 1.25 MB/sec. I used the netstat command to
   monitor the traffic volumes on the connections. I am not an expert in all the AIX
   parameters, but a quick look at error counters such as collision errors gives me an
   idea of the overall health of the LL Bean network. From the NMON output I suggested a
   number of TCP/IP changes.
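The bandwidth arithmetic above is just a divide-by-8 conversion from Mbit/sec to MB/sec, which can be checked in a one-liner:

```shell
# Convert link speeds from Mbit/sec to MB/sec (divide by 8).
t1=$(awk 'BEGIN { printf "%.3f", 1.544 / 8 }')   # T1 line
lan=$(awk 'BEGIN { printf "%.2f", 10 / 8 }')     # 10 Mbit Ethernet
echo "T1: ${t1} MB/sec, 10 Mbit Ethernet: ${lan} MB/sec"
```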
4. Index your table columns appropriately.
   - Ensure that columns that are joined in queries have indexes.
   - It can improve performance if columns involved in ORDER BY and GROUP BY are
     indexed.
   - Frequently accessed data can also be included within an index as INCLUDE
     columns.
   - Use the Index Advisor (also known as the Index Wizard in the DB2 Control
     Center) to help determine a good set of indexes to use, based on the tables
     and SQL statements that you use.
5. Ensure that your application holds locks for the shortest time possible.
   - When a user operation involves multiple interactions, each interaction should
     have its own transaction to commit and should free up all locks before
     returning control to the user. Keep the duration of a transaction as short as
     possible by starting its first SQL statement (which starts the transaction) as
     late as possible, and placing its updates (inserts, updates, and deletes,
     which take exclusive locks) as close to the commit as possible.
   - The DB2 registry parameter DB2_RR_TO_RS can improve concurrency by not locking
     the next key of the row that was inserted or updated. It can be used if no
     programs that operate on the same set of tables use the Repeatable Read (RR)
     isolation level. Use DB2 snapshots to monitor the number of deadlocks and lock
     waits, and use the DB2_STATUS.KSH script to keep up to date in this area.
6. Use stored procedures or compound SQL to minimize the network cost.
   - Minimizing the number of network trips for your SQL statements saves network
     latency and context switches, which can result in the application holding
     locks for a shorter period of time. Generally, a stored procedure should be
     used when an SQL transaction has more than 4 or 5 statements.
   - On the other hand, if complicated CPU-intensive processing is involved in the
     application logic, leaving it in a stored procedure running on the database
     server can use up excessive CPU cycles at the expense of database operations.
     In this case, either do not use a stored procedure, or execute part of the
     logic on the client side and the rest in a stored procedure.
7. Use SQL efficiently.
   - In general, don't use multiple SQL statements where one will do. When you
     provide more detailed search conditions by having more predicates in a query,
     the optimizer has a chance to make better choices. Also be selective in your
     queries so that the database does not return more rows and columns than you
     need. For example, use SQL to filter the rows you want; don't return all the
     rows and then require the application to do the filtering.
8. Analyze the access plan.
   - Use Visual Explain or db2exfmt to analyze each SQL statement. Make sure
     appropriate indexes are used to minimize the rows that have to be fetched
     internally when selecting and joining tables.
Updating catalog statistics
Background
The RUNSTATS utility updates statistics in the system catalog tables to help with the query
optimization process. Without these statistics, the database manager could make a decision that
would adversely affect the performance of an SQL statement. The RUNSTATS utility allows you
to collect statistics on the data contained in the tables, indexes, or both tables and indexes. Use
the RUNSTATS utility to collect statistics based on both the table and the index data to provide
accurate information to the access plan selection process in the following situations:

- When a table has been loaded with data, and the appropriate indexes have been created.
- When a table has been reorganized with the REORG utility.
- When there have been extensive updates, deletions, and insertions that affect a table
  and its indexes. ("Extensive" in this case may mean that 10 to 20 percent of the table
  and index data has been affected.)
- Before binding application programs whose performance is critical.
- When you want to compare new statistics with previous statistics. Running statistics on
  a periodic basis enables you to discover performance problems at an early stage.
- When the prefetch quantity is changed.
- When you have used the REDISTRIBUTE NODEGROUP utility.
When optimizing SQL queries, the decisions made by the SQL compiler are heavily influenced by
the optimizer's model of the database contents. This data model is used by the optimizer to
estimate the costs of alternative access paths that can be used to resolve a particular query. A
key element in the data model is the set of statistics gathered about the data contained in the
database and stored in the system catalog tables. This includes statistics for tables, nicknames,
indexes, columns, and user-defined functions (UDFs). A change in the data statistics can result in
a change in the choice of access plan selected as the most efficient method of accessing the
desired data.
Examples of the statistics available that help define the data model to the optimizer include:

- The number of pages in a table and the number of pages that are not empty.
- The degree to which rows have been moved from their original page to other (overflow)
  pages.
- The number of rows in a table.
- Statistics about individual columns, such as the number of distinct values in a column.
- The degree of clustering of an index; that is, the extent to which the physical
  sequence of rows in a table follows an index.
- Statistics about the index, such as the number of index levels and the number of leaf
  pages in each index.
- The number of occurrences of frequently used column values.
- The distribution of column values across the range of values present in the column.
- Cost estimates for user-defined functions (UDFs); in this case there are none.
RUNSTATS can help you determine how performance is related to changes in your database.
The statistics show the data distribution within a table. When used routinely, RUNSTATS
provides data about tables and indexes over a period of time, thereby allowing performance
trends to be identified for your data model as it evolves. Rebind applications that use
static SQL after using RUNSTATS so that the query optimizer can choose the best access
plan given the new statistics. However, for applications using dynamic SQL (e.g., most
vendor applications), rebinding is not necessary, since the statements are optimized based
on the statistics at run time. When statistical information about tables is not accurate,
it may cause performance problems; in a worst-case scenario, a particular SQL statement
may cause DB2 to use a table scan instead of an index scan. The statistics will be updated
by the db2service program included with the weekend maintenance scripts.
How to update the statistics
Statistics for objects are updated in the system catalog tables only when explicitly requested.
There are several ways to update some or all of the statistics:

- Using the RUNSTATS (run statistics) utility.
- Using LOAD, with statistics collection options specified.
- Coding SQL UPDATE statements that operate against a set of predefined catalog views.
- Using the "reorgchk update statistics" command.
When you do not know all the table names exactly, or there are too many, the easiest way
to run RUNSTATS is with the "db2 reorgchk update statistics" command. The exact script
looks like this, with ECOMP substituted for DB_NAME:
#!/bin/ksh
# Run RUNSTATS (via reorgchk) on every table whose statistics are more
# than 7 days old, in every local (Indirect) database, then rebind.
for i in `db2 list db directory | grep -B 4 -i Indirect | \
          grep name | grep Database | awk '{print $4}'`
do
    echo 'DBname = ' $i
    DBname=$i
    db2 connect to $i
    db2 -x "select rtrim(TABSCHEMA) || '.' || TABNAME from
            syscat.tables where stats_time < (current timestamp - 7 days)
            and type='T'" > runstats.temp
    if [ "$?" -eq 1 ]; then
        echo "No tables for database $DBname"
        continue
    fi
    while read tabname
    do
        db2 "reorgchk update statistics on table ${tabname}"
        db2 commit
    done < runstats.temp
    rm -f runstats.temp
    db2rbind $DBname -l bind.log
done
echo Runstats ended at `date`
The script above does not require table names. This one command performs RUNSTATS on all
tables.
Note: If you know the names of the tables, and to avoid a run over large numbers of
tables that may take a long time to complete, it is preferable to do RUNSTATS on each
table one at a time. The command looks like the following:
db2 -v runstats on table TAB_NAME and indexes all
This will collect statistics by table and all indexes (basic level).
Checking to see if RUNSTATS has been run
One quick way to see whether RUNSTATS has been performed on your database is to query
some system catalog tables. For example, as shown in the script above, you can run this
command:
db2 -v "select tbname, nleaf, nlevels, stats_time from sysibm.sysindexes"
If RUNSTATS has not yet been run, you will see "-1" for the nleaf and nlevels columns, and a "-"
for the stats_time column. These columns contain real numbers if RUNSTATS has been run, and
the stats_time column will contain the timestamp when RUNSTATS ran. If you think the time
shown in stats_time is too old, it's time to do runstats again. Based on the analysis for runstats
attached to this document, I recommend a full runstats be done using the script provided above.
Monitoring and tuning database configuration parameters
The following database configuration tuning steps were used in the LL Bean environment to
achieve reasonably good performance and at the same time enable LL Bean to avoid obvious
pitfalls. Among the configuration parameters, database manager configuration parameters require
a restart of the database manager, and most database configuration parameters require the
application to reconnect to the database in order to have the changes take effect. The
configuration parameters I recommend changing are described here:

- Buffer pool size
- Log buffer size
- Application heap size
- Sort heap size and sort heap threshold
- Number of agents
- Locks
- Maximum number of active applications
- Number of asynchronous page cleaners
- Number of I/O servers
- Number of commits to group
Buffer pool size
Background
A buffer pool is an area of storage in memory into which database pages (containing table rows
or index entries) are temporarily read and changed. The purpose of the buffer pool is to improve
database system performance. Data can be accessed much faster from memory than from a
disk. Therefore, the fewer times the database manager needs to read from or write to a disk, the
better the performance. The configuration of one or more buffer pools is the single most important
tuning area, since it is here that most of the data manipulation takes place for applications
connected to the database (excluding large objects and long field data).
By default, applications use the buffer pool called IBMDEFAULTBP, which is created when the
database is created. The DB2 database configuration parameter BUFFPAGE controls the size of
a buffer pool when the value of NPAGES is -1 for that buffer pool in the
SYSCAT.BUFFERPOOLS catalog table. Otherwise the BUFFPAGE parameter is ignored, and
the buffer pool is created with the number of pages specified by the NPAGES parameter. In LL
Bean’s case there are 5 buffer pools defined and in use.
Recommendations
For applications that rely primarily on one buffer pool, as at LL Bean, change NPAGES to -1 so that
BUFFPAGE controls the size of the buffer pool. This makes it easier to update and report the
buffer pool size along with the other DB2 database configuration parameters.
After making sure that the BUFFPAGE parameter in the database configuration controls the
buffer pool size, set it to a proper value. Setting it to a reasonably large value, based
on the size of your database and the nature of your application, is safe. Usually the
default value of this parameter is very small and unsatisfactory. Consider the following:

- As a starting point, if you have enough memory on your machine, set BUFFPAGE to
  40,000 pages (160 MB), or 10% of the total memory on the machine.
- For LL Bean's large ECOMP database, set aside as much memory as possible for the
  buffer pool while keeping the system stable. As a starting point, try 1.6 GB and then
  experiment with more.
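The 10% starting point can be turned into a BUFFPAGE value (4-KB pages) with simple arithmetic. This is a sketch: the 45 GB figure is the RAM amount cited earlier in this report, and the result is a starting point, not a tuned value.

```shell
# Derive a starting BUFFPAGE as 10% of physical memory, in 4 KB pages.
mem_mb=$(( 45 * 1024 ))                # 45 GB of RAM, in MB
target_mb=$(( mem_mb / 10 ))           # 10% starting point
buffpage=$(( target_mb * 1024 / 4 ))   # convert MB to 4 KB pages
echo "starting BUFFPAGE: ${buffpage} pages (${target_mb} MB)"
```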
How to change the parameter
Run the following script to:
1. Verify the catalog value
2. Enable the use of the database configuration parameter BUFFPAGE
3. Update the value of BUFFPAGE for all databases.
db2 -v connect to DB_NAME
db2 -v "select * from syscat.bufferpools"
db2 -v alter bufferpool IBMDEFAULTBP size -1
db2 -v connect reset
db2 -v update db cfg for DB_NAME using BUFFPAGE bigger_value
db2 -v terminate
Investigative steps
To determine whether the BUFFPAGE parameter is in use for the buffer pool sizes of a database, run:
db2 -v connect to DB_NAME
db2 -v "SELECT * from SYSCAT.BUFFERPOOLS"
db2 -v connect reset
db2 -v terminate
Examine the results. If each buffer pool has an NPAGES value of -1, then the buffer pool size is
being controlled through the BUFFPAGE parameter in the database configuration.
To determine whether the database buffer pool size is big enough, collect snapshots for the
database and/or buffer pool while running the application. A script similar to the following will give
you the needed information:
db2 -v update monitor switches using bufferpool on
db2 -v get monitor switches
db2 -v reset monitor all
-- run the Java application --
./db2_status.ksh -d ecompl
db2 -v terminate
Make sure that you issue the "db2 -v get snapshot" before you lose your database connection.
When the last application disconnects from the database, the database terminates and all
snapshot statistics will be lost. To ensure there is always a connection that keeps the database
up, use one of the following methods:

- Maintain one separate connection in the window where you are collecting snapshots.
- Use the DB2 ACTIVATE DATABASE command.
In the snapshot output, either from the database snapshot or buffer pool snapshot, look for the
following "logical reads" and "physical reads" so that you can calculate the buffer pool hit ratio,
which can help you tune your buffer pools:
MONITORING THE BUFFER POOLS
Buffer Pool Hit Rate - the higher, the better
=============================================
Buffer Pool Name | Buffer Pool Hit Rate (higher is better) | Buffer pool data logical reads | Buffer pool data physical reads
The buffer pool hit ratio indicates the percentage of time that the database manager did not need
to load a page from disk in order to service a page request; that is, the page was already in the
buffer pool. The greater the buffer pool hit ratio, the lower the frequency of disk I/O. The 32K
buffer pool will need to be updated after a full table reorganization is done. To calculate the
buffer pool hit ratio, the formula is as follows:
(1 - ((buffer pool data physical reads + buffer pool index physical reads) /
      (buffer pool data logical reads + buffer pool index logical reads))) * 100%
This calculation takes into account all of the pages (index and data) that are cached by the buffer
pool. Ideally this ratio should be over 95%, and as close to 100% as possible. To increase the
buffer pool hit ratio, try the following:
- Increase the buffer pool size.
- Consider allocating multiple buffer pools: possibly one for each frequently accessed
  large table (with its own table space) and one for a group of small tables, as is done
  here. The next step is then to adjust the sizes of the buffer pools to see which
  combination provides the best performance.
Avoid over-allocating memory to buffer pools if the memory allocated cannot help performance.
The buffer pool sizes should be determined based on snapshot information taken from the test
environment.
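The hit-ratio formula above can be applied to snapshot counters with a short awk filter. This is a sketch: the counter values in the here-document are made up, standing in for "db2 get snapshot for bufferpools" output.

```shell
# Compute the buffer pool hit ratio from snapshot counters; the sample
# numbers below stand in for real "db2 get snapshot" output.
ratio=$(awk -F'= *' '
/data logical reads/   { dlr = $2 }
/data physical reads/  { dpr = $2 }
/index logical reads/  { ilr = $2 }
/index physical reads/ { ipr = $2 }
END { printf "%.1f", (1 - (dpr + ipr) / (dlr + ilr)) * 100 }
' <<'EOF'
Buffer pool data logical reads     = 180000
Buffer pool data physical reads    = 9000
Buffer pool index logical reads    = 20000
Buffer pool index physical reads   = 1000
EOF
)
echo "buffer pool hit ratio: ${ratio}%"
```

With these sample counters, 10,000 physical reads against 200,000 logical reads yields a 95.0% hit ratio, just at the recommended floor.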
Log buffer size (LOGBUFSZ)
Background
LOGBUFSZ is a database configuration parameter that sizes the log buffer. It specifies the
amount of database shared memory to use as a buffer for log records before writing them to
disk. The log records are written to disk when one of the following events occurs:

- A transaction commits.
- The log buffer is full.
- Some other internal database manager event occurs.
Buffering the log records results in more efficient log file I/O, because the log records are written
to disk less frequently and more log records are written each time. Increase the size of this buffer
area if there is considerable read activity on a dedicated log disk, or if there is high disk utilization.
When increasing the value of this parameter, consider the DBHEAP parameter, too, because the
log buffer area uses space controlled by the DBHEAP parameter.
How to change the parameter
LL Bean is using the default value for this parameter, 8 (4-KB pages), which is not big
enough for an LL Bean database. The optimal value for LOGBUFSZ is 128 or 256 4-KB pages.
For example, you can use the commands below to change it:
db2 -v update database cfg for DB_NAME using LOGBUFSZ <value>
db2 -v terminate
Investigative steps
Use the database snapshot to determine whether the LOGBUFSZ parameter is optimal or not by
looking at the lines shown in the following example from the LL Bean production server:
Log pages read                          = 484961
Log pages written                       = 21963890
Maximum total log space used (Bytes)    = 8,024,082,185 (8 GB)
In general, the ratio between "log pages read" and "log pages written" should be as small
as possible. An ideal value is zero log pages read alongside a good number of log pages
written. When there are too many log pages read, a bigger LOGBUFSZ is needed. The current
LOGBUFSZ is 8.
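The check above can be made concrete by expressing log pages read as a rate per 1,000 pages written, using the production counters quoted in the snapshot:

```shell
# Express "log pages read" per 1000 pages written; near zero is ideal.
# Counter values are the ones quoted from the LL Bean production snapshot.
read_pages=484961
written_pages=21963890
per_thousand=$(awk -v r="$read_pages" -v w="$written_pages" \
    'BEGIN { printf "%.1f", r / w * 1000 }')
echo "log pages read per 1000 written: ${per_thousand}"
```

Roughly 22 pages read per 1,000 written is well above the near-zero ideal, supporting the recommendation to raise LOGBUFSZ from its default of 8.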
Application heap size (APPLHEAPSZ)
Background
APPLHEAPSZ is a database configuration parameter that defines the number of private memory
pages available to be used by the database manager on behalf of a specific agent or subagent.
The heap is allocated when an agent or subagent is initialized for an application. The amount
allocated is the minimum amount needed to process the request given to the agent or subagent.
As the agent or subagent requires more heap space to process larger SQL statements, the
database manager will allocate memory as needed, up to the maximum specified by this
parameter.
How to change the parameter
Here is the command to change the default value of 128 4KB pages to a new value:
db2 -v update db cfg for DB_NAME using applheapsz <new value>
db2 -v terminate
Investigative steps
When your application receives an error indicating that there is not enough storage in the application heap, increase the value of APPLHEAPSZ, for example to 1024.
Sort heap size (SORTHEAP) and sort heap threshold
(SHEAPTHRES)
Background
SORTHEAP is a database configuration parameter that defines the maximum number of private
memory pages to be used for private sorts, or the maximum number of shared memory pages to
be used for shared sorts. If the sort is a private sort, then this parameter affects agent private
memory. If the sort is a shared sort, then this parameter affects the database shared memory.
Each sort has a separate sort heap that is allocated as needed, by the database manager. This
sort heap is the area where data is sorted. If directed by the optimizer, a smaller sort heap than
the one specified by this parameter is allocated using information provided by the optimizer.
SHEAPTHRES is a database manager configuration parameter. Private and shared sorts use
memory from two different memory sources. The size of the shared sort memory area is statically
predetermined at the time of the first connection to a database based on the value of
SHEAPTHRES. The size of the private sort memory area is unrestricted. The SHEAPTHRES
parameter is applied differently for private and shared sorts:
 For private sorts, SHEAPTHRES is an instance-wide soft limit on the total amount of memory that can be consumed by private sorts at any given time. When the total private-sort memory consumption for an instance reaches this limit, the memory allocated for additional incoming private-sort requests is considerably reduced.
 For shared sorts, SHEAPTHRES is a database-wide hard limit on the total amount of memory consumed by shared sorts at any given time. When this limit is reached, no further shared-sort memory requests are allowed until the total shared-sort memory consumption falls below the limit specified by SHEAPTHRES.
Examples of operations that use the sort heap include hash joins and operations where the table
is in memory. Explicit definition of the threshold prevents the database manager from using
excessive amounts of memory for large numbers of sorts.
Recommendations
 Use the database system monitor to track sort activity.
 Use appropriate indexes to minimize the use of the sort heap.
 When frequent large sorts are required, increase the value of SORTHEAP.
 If you increase SORTHEAP, determine whether the SHEAPTHRES parameter in the database manager configuration file also needs to be adjusted.
 The sort heap size is used by the optimizer in determining access paths. Consider rebinding applications (using the REBIND PACKAGE command) after changing this parameter.
 Ideally, set the sort heap threshold (SHEAPTHRES) parameter to a reasonable multiple of the largest SORTHEAP parameter in your database manager instance. This parameter should be at least two times the largest SORTHEAP defined for any database within the instance. Use the DB2_STATUS.KSH script for help with this.
How to change the parameters
To change the values of SORTHEAP and SHEAPTHRES, run the following commands:
-- SORTHEAP is changed for an individual database --
db2 -v update db cfg for DB_NAME using SORTHEAP a_value
-- SHEAPTHRES is a database manager parameter --
db2 -v update dbm cfg using SHEAPTHRES b_value
db2 -v terminate
Investigative steps
Your LL Bean application should not be performing large sorts. They are too costly in terms of CPU and I/O resources. Usually, the default value for SORTHEAP size (256 4KB pages) is
adequate. In fact, for high concurrency OLTP, you may want to decrease this value from the
default. When further investigation is needed, you can issue the following command:
db2 -v update monitor switches using sort on
Then, let your application run for a while, and type:
db2 -v get snapshot for database on DBNAME
Look at the output in the following example:
Active Sorts                       = 3
Total Sort Heap Allocated          = 276
Total sort time (ms)               = 78818393
Total Sorts                        = 23203178
Average elapsed time per sort      = 3.39 ms
Sorts requiring merge phases       = 0.17 %
  (i.e., started as piped sorts but required more memory than allocated)
Post Threshold Sorts               = 0
Piped Sorts Requested              = 0
Piped Sorts Accepted               = 0
Piped Sorts Serviced               = 100.00 %
Commit statements attempted        = 210927
Rollback statements attempted      = 905
Dynamic statements attempted       = 93416272
Static statements attempted        = 284145
From this, you can calculate the number of sorts per transaction and the percentage of sorts that
overflowed the memory that was available to them.
SortsPerTransaction = (Total Sorts) / (Commit statements attempted + Rollback statements attempted) = 109
PercentSortOverflow = (Sort overflows * 100) / (Total Sorts) = 1.4
Rule of thumb: If SortsPerTransaction is greater than 5, it might indicate there are too many
sorts per transaction. If PercentSortOverflow is greater than 3 percent, there may be serious and
unexpected large sorts occurring. When this happens, increasing SORTHEAP just hides the
performance problem--it does not fix it. The true solution to this problem is to improve the access
plan for problematic SQL statements by adding the correct indexes.
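The two rules of thumb can be checked with a few lines of shell arithmetic. This sketch uses the example counters above; the "Sort overflows" figure is a hypothetical value for illustration, since that counter is not shown in the excerpt.

```shell
# Compute SortsPerTransaction and PercentSortOverflow from snapshot counters.
total_sorts=23203178
commits=210927
rollbacks=905
sort_overflows=324844          # hypothetical; "Sort overflows" not shown in the excerpt
sorts_per_txn=$(( total_sorts / (commits + rollbacks) ))
pct_overflow=$(( sort_overflows * 100 / total_sorts ))
echo "SortsPerTransaction = ${sorts_per_txn}"   # > 5 suggests too many sorts per transaction
echo "PercentSortOverflow = ${pct_overflow}%"   # > 3% suggests unexpected large sorts
```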
Number of agents (MAXAGENTS, NUM_POOLAGENTS and
NUM_INITAGENTS)
Background
These are database manager configuration parameters.
 The MAXAGENTS parameter indicates the maximum number of database manager agents that are available at any given time to accept application requests. The value of MAXAGENTS should be at least the sum of the values for MAXAPPLS (maximum concurrent applications) in each database to be accessed concurrently. If the number of databases is greater than the NUMDB parameter, then the safest course is to use the product of NUMDB with the largest value for MAXAPPLS. Each additional agent requires some resource overhead that is allocated at the time the database manager is started.
 The NUM_POOLAGENTS parameter is a guideline for how large you want the agent pool to grow. If more agents are created than is indicated by the value of this parameter, they will be terminated when they finish executing their current request, rather than be returned to the pool. If the value for this parameter is 0, agents will be created as needed, and may be terminated when they finish executing their current request. To avoid the costs associated with the frequent creation and termination of agents in an OLTP environment in which many applications are concurrently connected, increase the value of NUM_POOLAGENTS to be closer to the value of MAXAGENTS.
 The NUM_INITAGENTS parameter determines the initial number of idle agents that are created in the agent pool at DB2START time. Specifying a sizable number of initial agents, while not necessary, can accelerate the warm-up period.
Recommendations
In most cases, set MAXAGENTS and NUM_POOLAGENTS to a value that slightly exceeds the maximum expected number of concurrent application connections; for LL Bean this means a value over 300. Leaving NUM_INITAGENTS at the default should be fine; we used 64 as a starting point.
How to change the parameter
In order to change these parameters, run the following commands:
db2 -v update dbm cfg using MAXAGENTS a_value
db2 -v update dbm cfg using NUM_POOLAGENTS b_value
db2 -v update dbm cfg using NUM_INITAGENTS c_value
db2 -v terminate
Investigative steps
Anytime during a run, you can use the following command to get the snapshot data for database
manager:
db2 -v get snapshot for database manager
and look for the following lines of output:
High water mark for agents registered           = 125
High water mark for agents waiting for a token  = 0
Agents registered                               = 114
Agents waiting for a token                      = 0
Idle agents                                     = 70
If you find that either "Agents waiting for a token" or "Agents stolen from another application" is not equal to 0, you may need to increase MAXAGENTS to allow more agents to be available to the database manager. In LL Bean's current environment there is no need to change MAXAGENTS; the DB2_STATUS.KSH script will guide the way.
Locks (LOCKLIST, MAXLOCKS and LOCKTIMEOUT)
Background
These lock-related controls are database configuration parameters:
 LOCKLIST indicates the amount of storage that is allocated to the lock list. There is one lock list per database, and it contains the locks held by all applications concurrently connected to the database. Locking is the mechanism that the database manager uses to control concurrent access to data in the database by multiple applications. Both rows and tables can be locked. Each lock requires 32 or 64 bytes of the lock list, depending on whether or not other locks are held on the object:
o 64 bytes are required to hold a lock on an object that has no other locks held on it.
o 32 bytes are required to record a lock on an object that has an existing lock held on it.
 MAXLOCKS defines the percentage of the total lock list that one application must fill before the database manager performs lock escalation. When the number of locks held by any one application reaches this percentage of the total lock list size, the database manager escalates that application's locks, replacing row locks with table locks and thereby reducing the number of locks in the list. Lock escalation also occurs if the lock list runs out of space. The database manager determines which locks to escalate by looking through the lock list for the application and finding the table with the most row locks. If, after replacing these with a single table lock, the MAXLOCKS value is no longer exceeded, lock escalation stops; if not, it continues until the percentage of the lock list held is below the value of MAXLOCKS. The MAXLOCKS parameter multiplied by the MAXAPPLS parameter cannot be less than 100.
Although the escalation process itself does not take much time, locking entire tables
(versus individual rows) decreases concurrency, and overall database performance may
decrease for subsequent accesses against the affected tables.
Suggestions for controlling the size of the lock list include:
o Commit frequently to release locks.
o When performing many updates, lock the entire table for the duration of the transaction before updating (using the SQL LOCK TABLE statement). This uses only one lock and keeps others from interfering with the updates, but it does reduce concurrency of the data to other users.
o Use the LOCKSIZE parameter of the ALTER TABLE statement to control how locking is done for a specific table on a permanent basis.
o Examine the isolation level used for the application. Using the Repeatable Read isolation level may result in an automatic table lock in some cases. Use the Cursor Stability isolation level when possible to decrease the number of share locks held. If application integrity requirements are not compromised, use Uncommitted Read instead of Cursor Stability to further decrease the amount of locking.
Use the following steps to determine the number of pages required for your lock list:
1. Calculate a lower bound for the size of your lock list: (512 * 32 * MAXAPPLS) / 4096, where 512 is an estimate of the average number of locks per application and 32 is the number of bytes required for each lock against an object that has an existing lock.
2. Calculate an upper bound for the size of your lock list: (512 * 64 * MAXAPPLS) / 4096, where 64 is the number of bytes required for the first lock against an object.
3. Estimate the amount of concurrency you will have against your data and, based on your expectations, choose an initial value for LOCKLIST that falls between the upper and lower bounds you have calculated.
4. Use the database system monitor to tune the MAXLOCKS value. When setting MAXLOCKS, consider the size of the lock list (LOCKLIST):

MAXLOCKS = 100 * (512 locks per application * 32 bytes per lock * 2) / (LOCKLIST * 4096 bytes)

This sample formula allows any application to hold twice the average number of locks. You can increase MAXLOCKS if only a few applications run concurrently, because there will not be a lot of contention for the lock list space under these conditions.
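The sizing arithmetic above can be sketched in a few lines of shell. The MAXAPPLS and LOCKLIST values below are assumptions chosen for illustration, not LL Bean settings.

```shell
# Lock list sizing sketch: lower/upper bounds in 4KB pages, plus the sample
# MAXLOCKS starting point. Substitute your own MAXAPPLS and LOCKLIST.
maxappls=1000                                  # hypothetical
locklist=200                                   # configured LOCKLIST in 4KB pages, hypothetical
lower=$(( 512 * 32 * maxappls / 4096 ))        # all locks are follow-on locks (32 bytes)
upper=$(( 512 * 64 * maxappls / 4096 ))        # all locks are first locks (64 bytes)
maxlocks=$(( 100 * 512 * 32 * 2 / (locklist * 4096) ))
echo "LOCKLIST bounds: ${lower}-${upper} 4KB pages"
echo "MAXLOCKS starting point: ${maxlocks}%"
```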
 LOCKTIMEOUT specifies the number of seconds that an application will wait to obtain a lock. This helps avoid global deadlocks for applications.
 If you set this parameter to 0, the application will not wait for locks. In this situation, if no lock is available at the time of the request, the application immediately receives a -911.
 If you set this parameter to -1, lock timeout detection is turned off. In this situation, the application will wait for a lock (if one is not available at the time of the request) until either the lock is granted or a deadlock occurs.
Recommendation
Set LOCKTIMEOUT to quickly detect waits that are occurring because of an abnormal situation, such as a transaction that is stalled (possibly as a result of a user leaving their workstation). Set it
high enough so that valid lock requests do not time-out because of peak workloads, during which
time there is an increased wait for locks.
In an online transaction processing (OLTP) environment, start with a value of 30 seconds. In a
query-only environment you could start with a higher value. In either case, use benchmarking
techniques to tune this parameter.
How to change the parameters
To change the lock parameters, run the following commands:
db2 -v update db cfg for DB_NAME using LOCKLIST a_number
db2 -v update db cfg for DB_NAME using MAXLOCKS b_number
db2 -v update db cfg for DB_NAME using LOCKTIMEOUT c_number
db2 -v terminate
Investigative steps
Once the lock list is full, performance can degrade because lock escalation generates more table
locks and fewer row locks, thus reducing concurrency on shared objects in the database.
Additionally, there may be more deadlocks between applications (because they are all waiting on
a limited number of table locks), which will result in transactions being rolled back. Your
application will receive an SQLCODE of -912 when the maximum number of lock requests has
been reached for the database. If lock escalations are causing performance concerns you may
need to increase the value of LOCKLIST parameter or the MAXLOCKS parameter. You may use
the database system monitor to determine if lock escalations are occurring, to track the number of
times an application (connection) experienced a lock timeout, or that a database detected a
timeout situation for all applications that were connected.
1. First, run the following command to turn on the DB2 monitor for locks:
db2 -v update monitor switches using lock on
db2 -v terminate
2. Then collect your snapshots for the database:
db2 -v get snapshot for database on DB_NAME
3. In the snapshot output, examine the following items:

Locks held currently                   = 0
Lock waits                             = 0
Time database waited on locks (ms)     = 25047
Lock list memory in use (Bytes)        = 1824552
Deadlocks detected                     = 0
Lock escalations                       = 526
Exclusive lock escalations             = 526
Agents currently waiting on locks      = 0
Lock Timeouts                          = 120
Internal rollbacks due to deadlock     = 0
If the "Lock list memory in use (Bytes)" exceeds 50 percent of the defined LOCKLIST size, then increase the number of 4KB pages in the LOCKLIST database configuration parameter. The lock escalations and lock timeouts shown above do indicate some potential problems in the system or application. Locking problems normally point to fairly significant concurrency problems in the application, and these should be dealt with before the LOCKLIST parameter is increased.
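The 50 percent check is easy to script. In this sketch, the configured LOCKLIST value is a hypothetical placeholder; the bytes-in-use figure comes from the example snapshot above. Note that LOCKLIST is configured in 4KB pages while the snapshot reports bytes.

```shell
# Check lock list memory in use against 50% of the configured LOCKLIST.
locklist_pages=1000                     # configured LOCKLIST in 4KB pages, hypothetical
lock_mem_bytes=1824552                  # "Lock list memory in use (Bytes)" from the snapshot
locklist_bytes=$(( locklist_pages * 4096 ))
pct=$(( lock_mem_bytes * 100 / locklist_bytes ))
echo "lock list in use: ${pct}% of LOCKLIST"  # > 50% suggests raising LOCKLIST
```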
Maximum number of active applications (MAXAPPLS)
Background
MAXAPPLS is a database configuration parameter. It specifies the maximum number of concurrent applications (both local and remote) that can be connected to a database. Because each application that attaches to a database requires some private memory to be allocated, allowing a larger number of concurrent applications will use more memory. The value of this parameter must be equal to or greater than the sum of the connected applications, plus the number of these same applications that may be concurrently in the process of completing a two-phase commit or rollback.
Recommendations
To run the LL Bean Internet application, make sure that MAXAPPLS is set to the right value
(large enough but not unnecessarily large) to accommodate the maximum concurrent
users/connections. For those applications that use connection pooling, I suggest setting MAXAPPLS to the connection pool size plus one or two (in case you need to open a command-line connection to do something at the same time).
How to change the parameter
To change the value of MAXAPPLS, run the following command:
db2 -v update db cfg for DB_NAME using MAXAPPLS a_number
db2 -v terminate
Investigative steps
When an application attempts to connect to a database but the value of MAXAPPLS has already been reached, the following error is returned, indicating that the maximum number of applications are already connected to the database. The current setting is 1000.
SQL1040N The maximum number of applications is already connected
to the database. SQLSTATE=57030
Number of asynchronous page cleaners (NUM_IOCLEANERS)
Background
NUM_IOCLEANERS is a database configuration parameter that lets you specify the number of
asynchronous page cleaners for a database. These page cleaners write changed pages from the
buffer pool to disk before the space in the buffer pool is required by a database agent. This allows
the agents to read new pages without having to wait for changed pages to be written out. As a
result, your application's transactions should run faster.
If you set the parameter to zero (0), no page cleaners are started and as a result, the database
agents will perform all of the page writes from the buffer pool to disk. This parameter can have a
significant performance impact on a database stored across many physical storage devices,
because in this case there is a greater likelihood that one of the devices will be idle. If no page
cleaners are configured, your applications may encounter periodic "log full" conditions.
If the applications for a database consist primarily of transactions that update data, an increase in
the number of cleaners will speed up performance. Increasing the page cleaners will also reduce
recovery time from soft failures, such as power outages, because the contents of the database on
disk will be more up-to-date at any given time.
Here are some factors to consider when setting the value for this parameter:
1. If transactions are run against the database, set this parameter to be between one and
the number of physical storage devices used for the database. My recommendation is to
set it at least to the number of CPUs on your system.
2. Environments with high update transaction rates may require more page cleaners to be
configured.
3. Environments with large buffer pools may also require more page cleaners to be
configured.
How to change the parameter
The following command can be used to set this parameter to a new value:
db2 -v update db cfg for DB_NAME using NUM_IOCLEANERS a_number
db2 -v terminate
Investigative steps
Use the database system monitor to help you tune this configuration parameter using information
from the snapshot data (or event monitor) about write activity from a buffer pool.
When using snapshot and collecting snapshot data for the buffer pool, monitor the following
counters:
Buffer pool data writes                  = 1254991
Asynchronous pool data page writes       = 1205276
Buffer pool index writes                 = 4746571
Asynchronous pool index page writes      = 4716387
LSN Gap cleaner triggers                 = 184582
Dirty page steal cleaner triggers        = 138781
Dirty page threshold cleaner triggers    = 987976
How do you decide whether NUM_IOCLEANERS should be reduced or increased?
Decrease NUM_IOCLEANERS if both of the following conditions are true:
 "Buffer pool data writes" is approximately equal to "Asynchronous pool data page writes."
 "Buffer pool index writes" is approximately equal to "Asynchronous pool index page writes."
Increase NUM_IOCLEANERS if either of the following conditions is true:
 "Buffer pool data writes" is much greater than "Asynchronous pool data page writes."
 "Buffer pool index writes" is much greater than "Asynchronous pool index page writes."
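The "approximately equal" comparisons reduce to asynchronous-write percentages. This sketch computes them from the example counters above; near 100 percent means the cleaners are keeping up.

```shell
# Percentage of buffer pool page writes performed asynchronously by cleaners.
data_writes=1254991
async_data_writes=1205276
index_writes=4746571
async_index_writes=4716387
data_pct=$(( async_data_writes * 100 / data_writes ))
index_pct=$(( async_index_writes * 100 / index_writes ))
echo "async data writes: ${data_pct}%; async index writes: ${index_pct}%"
```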
Dirty page steal cleaner triggers tells the number of times a page cleaner was invoked because a synchronous write was needed during victim buffer replacement for the database. For better response time, this number should be as low as possible. With the counters shown above, you can use the following formula to calculate what percentage of all cleaner invocations is represented by this element:

Dirty page steal cleaner triggers / (Dirty page steal cleaner triggers + Dirty page threshold cleaner triggers + LSN Gap cleaner triggers)

If this ratio is high, it may indicate that you have too few page cleaners defined. Too few page cleaners also increases recovery time after failures. The current ratio is 0.105.
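The ratio can be reproduced from the counters with integer arithmetic, scaling by 1000 to keep three decimal places:

```shell
# Dirty-page-steal trigger ratio from the example cleaner counters.
lsn_gap=184582
steal=138781
threshold=987976
# Scale by 1000 so integer arithmetic keeps three decimal places.
ratio_milli=$(( steal * 1000 / (steal + threshold + lsn_gap) ))
echo "steal trigger ratio: 0.${ratio_milli}"
```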
Number of I/O servers (NUM_IOSERVERS)
Background
I/O servers are used on behalf of the database agents to perform prefetch I/O and asynchronous
I/O by utilities such as backup and restore. This parameter, a database configuration parameter,
specifies the number of I/O servers for a database. No more than this number of I/Os for prefetching and utilities can be in progress for a database at any time. An I/O server waits while an
I/O operation that it initiated is in progress. Non-prefetch I/Os are scheduled directly from the
database agents and as a result are not constrained by NUM_IOSERVERS.
Recommendations
In the LL Bean environment, use the default.
How to change the parameter
Use the following command to set NUM_IOSERVERS to a new value:
db2 -v update db cfg for DB_NAME using NUM_IOSERVERS a_number
db2 -v terminate
Number of commits to group (MINCOMMIT)
Background
MINCOMMIT is a database configuration parameter that lets you delay the writing of log records to
disk until a minimum number of commits have been performed. This delay can help reduce the
database manager overhead associated with writing log records. This can mean improved
performance when you have multiple applications running against a database and many commits
are requested by the applications within a very short time frame. This grouping of commits will
only occur when the value of this parameter is greater than one and when the number of
applications connected to the database is greater than, or equal to, the value of this parameter.
When commit grouping is being performed, application commit requests are held until either one
second has elapsed or the number of commit requests equals the value of this parameter.
Recommendations
The default value for MINCOMMIT is 1. Increase this parameter from its default value if multiple
read/write applications typically request concurrent database commits. This will result in more
efficient log file I/O because it will occur less frequently and write more log records each time it does occur. If you believe the default value is not adequate, then it is recommended that you
start with 3, and move it up and down to see the performance impact on your workload. You
could also sample the number of transactions per second and adjust this parameter to
accommodate the peak number of transactions per second (or some large percentage of it).
Accommodating peak activity minimizes the overhead of writing log records during heavy load
periods.
If you increase MINCOMMIT, you may also need to increase the LOGBUFSZ parameter to avoid
having a full log buffer force a write during these heavy load periods. In this case, the LOGBUFSZ
should be equal to:
MINCOMMIT * (log space used, on average, by a transaction)
Here is how to use the database system monitor to help you tune this parameter in the following
ways:
 Calculating the peak number of transactions per second:
By taking monitor samples throughout a typical day, you can determine your heavy load periods. One way to accomplish this is as follows:
1. At the beginning of your measurements, issue the following command:
db2 -v reset monitor for database db_name
(This will not reset the counters for high water marks.)
2. At the end of your measurements, issue the following command:
db2 -v get snapshot for database on db_name
3. Use the following output to calculate the peak number of transactions:
First database connect timestamp              = 11-12-2007 05:29:35.658685
Last reset timestamp                          =
Last backup timestamp                         = 11-12-2007 00:36:52.276504
Snapshot timestamp                            = 11-12-2007 21:38:28.104288
Commit statements attempted                   = 210927
Rollback statements attempted                 = 905
Dynamic statements attempted                  = 93416272
Static statements attempted                   = 284145
Failed statement operations                   = 342903
Log space available to the database (Bytes)   = 27726997980
Let totalTransactions be the sum of “commit statements attempted” and “rollback
statements attempted.”
Let totalElapsedTime (in seconds) be the difference between "Last reset timestamp" and "Snapshot timestamp". Calculate the number of transactions per second as:

NumOfTransPerSecond = totalTransactions / totalElapsedTime
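As a sketch of this calculation, using the example commit/rollback counters above and an assumed one-hour measurement window (the real elapsed time comes from the two timestamps):

```shell
# Transactions per second from snapshot counters.
commits=210927
rollbacks=905
elapsed_seconds=3600               # assumed 1-hour window; derive from timestamps in practice
total_txn=$(( commits + rollbacks ))
tps=$(( total_txn / elapsed_seconds ))
echo "NumOfTransPerSecond = ${tps}"
```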
 Calculating the log space used per transaction:
In a similar manner, by using sampling techniques over a period of time and a number of transactions, you can calculate an average of the log space used with the following monitor element: log_space_used (unit of work log space used).
1. Reset the monitor for the database of interest at the beginning of the measurements using the command:
db2 -v reset monitor for database db_name
2. Take the snapshots at the end of the measurements using the command:
db2 -v get snapshot for database on db_name
3. Output like that shown above is produced.
4. Calculate the log space used per transaction using the following formula:
LogSpaceUsedPerTrans = log_space_used / totalTransactions
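Tying this back to the LOGBUFSZ sizing rule given earlier (LOGBUFSZ = MINCOMMIT * average log space per transaction), here is a sketch; the log_space_used figure is an assumed sample value, not an LL Bean measurement.

```shell
# Size LOGBUFSZ from MINCOMMIT and measured log space per transaction.
log_space_used=2147483648         # bytes over the sample window, hypothetical
total_txn=211832                  # commits + rollbacks from the example snapshot
mincommit=3
per_txn=$(( log_space_used / total_txn ))
logbufsz_bytes=$(( mincommit * per_txn ))
logbufsz_pages=$(( (logbufsz_bytes + 4095) / 4096 ))   # round up to 4KB pages
echo "LogSpaceUsedPerTrans = ${per_txn} bytes; LOGBUFSZ >= ${logbufsz_pages} pages"
```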
How to change the parameter
Use the following command to change the MINCOMMIT value:
db2 -v update db cfg for DB_NAME using MINCOMMIT a_number
db2 -v terminate
AIX Disk Layout Considerations
The purpose of this section is to provide some considerations that are recommended to enable
either improved data access speed (striping) or high availability of data (mirroring), depending
upon which best suits LL Bean’s business purposes following the new design implementation.
From an AIX perspective, planning is important, and good disk layouts are imperative for optimum
DB2 performance. This is one of the most overlooked areas in database planning. Striping, also
known as RAID 0, is a technology that was developed to achieve maximum I/O performance. The
basic concept is that data is written and read from the file system in chunks across the width of
the physical disk layout in parallel. The width represents the number of disks within the layout; striping represents the contiguous layout of data across separate disks. If performance is what
you have in mind, striping is what you want.
Mirroring, also known as RAID 1, is a feature on AIX that can be useful within a DB2 data layout.
The objective of mirroring is redundancy and high availability of data. Having two or more copies
of data is important from a high availability standpoint in reducing downtime. If one copy goes
bad, the system automatically starts using the other copy. If high availability of data is important,
mirroring should be your choice.
There are, however, tradeoffs to be considered in each layout. For instance, although striping
gives you the best performance in parallel reads and writes, failure of a single disk will render
your data useless. Although mirroring will give you data redundancy, writing of data to disk is
costly because it would mean up to three writes depending on how many copies you have
configured. But, because of the flexibility of the LVM in AIX, you could minimize the penalty factor
of each one of these configurations by tuning the following parameters:
 Striping
o max_coalesce
o queue_depth
o minpgahead
o maxpgahead
o lvm_bufcnt
 Mirroring
o Write scheduling policy
o Mirror write consistency
o Write verify
Conclusion
Traditional 24 x 7 performance monitors only meet the demands of today’s technology
environment from an operator’s perspective, since they are designed to provide high-level
availability information.
IBM provides the diagnostic and resolution capabilities that database administrators need
to address today’s performance and availability problems in detail. This document provides
tuning parameters in both AIX and DB2 for optimal DB2 performance. It touches on file system
layouts from an AIX perspective, and buffer pool management from a DB2 perspective. It
contains a few monitoring utilities in DB2 and AIX to aid in capturing performance data, and
recommends some specific operating system environment settings to allow for maximum
resource usage by DB2.
The attached files consisting of scripts written specifically for LL Bean provide the following: an
innovative performance interface, detailed statistics, and robust alarming capabilities, making it
possible to quickly diagnose any DB2 error. The AIX_CHECK.KSH script is the result of the
interactive analysis done by IBM for rapid problem identification.
DB2_STATUS.KSH is the second file to be used for applying the results of the analysis done
above in this document. These components allow LL Bean users to easily resolve problems
identified with the provided software. All tools and support files such as the db2support.zip are
available in the DB2 home directory of the production server.
It has been a pleasure working with Ms. Nancy Jackson and the LL Bean staff.
Appendix: DB2_STATUS (KORN SHELL Script to Perform the
Configuration Update)
#*********************************************************************#
#* (c) Copyright IBM 2007. All rights reserved.                      *#
#*                                                                   *#
#* This sample program is owned by IBM and is copyrighted            *#
#* and licensed, not sold.                                           *#
#*                                                                   *#
#* You may copy, modify, and distribute this sample program in any   *#
#* form without payment to IBM, for any purpose including            *#
#* coding, marketing or distributing programs that include or are    *#
#* derivative works of the sample program.                           *#
#*                                                                   *#
#* The sample program is provided to you on an "AS IS" basis, without*#
#* warranty of any kind. IBM HEREBY EXPRESSLY DISCLAIMS ALL          *#
#* WARRANTIES EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED  *#
#* TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A    *#
#* PARTICULAR PURPOSE. Some jurisdictions do not allow for the       *#
#* exclusion or limitation of implied warranties, so the above       *#
#* limitations or exclusions may not apply to you. IBM shall not be  *#
#* liable for damages you suffer as a result of using, modifying or  *#
#* distributing the sample program or its derivatives.               *#
#*                                                                   *#
#* Each copy of any portion of this sample program or any derivative *#
#* work must include the above copyright notice and disclaimer of    *#
#* warranty.                                                         *#
#*********************************************************************#
#*********************************************************************#
#* This script updates the necessary db2 Database Manager and active *#
#* database parameters. It will also perform various system commands *#
#* on db2 to improve performance.                                    *#
#*********************************************************************#
#!/bin/ksh
#
# db2_status   Monitor DB2 UDB 8.1/9.1 server and provide tuning suggestions
#              where possible/appropriate.
#
#              It takes two parameters: the database name and
#              a comment to be placed on the output file (optional).
#
# Author:      Shelton Reese, DB2 Support
#              IBM, LLBean, Xtivia
# Date:        Nov, 2007
#
display_syntax()
{
echo "\
SYNTAX: `basename $0` -d dbname [-c \"comment\"] [-o dir] [-v]
Create an SQL script which can be used to restore the current settings of the
DBM CFG and DB CFG.
  -c: Comment placed on first line of Summary output file (optional)
  -d: Database name, indicates to use DB CFG (default is DBM CFG)
  -o: Directory in which output files are placed (optional)
  -v: Verbose (default is NOT to display generated SQL script)
Notes:
  Output of DB/M CFG also saved in -o directory.
  Value for -d is part of output filename and generated script filename.
"
}
# Defaults
QUIET=1
dbName=""
REUSE_OUT=0
PARAMS=$*
INTERVAL=20
# Parse parameters
while [ "$1" != "" ]
do
    case "$1" in
        "-c" ) shift; comment=$1;     shift ;;
        "-d" ) shift; dbName=$1;      shift ;;
        "-o" ) shift; RESULTS_DIR=$1; shift ;;
        "-v" ) shift; QUIET=0               ;;
        *    ) shift; PARSE_ERROR="Invalid Param" ;;
    esac
done
# Verify parameters
[ "$dbName" = "" ] && \
PARSE_ERROR="${PARSE_ERROR} -d Database is required"
if [ "$PARSE_ERROR" != "" ]
then
echo ""
echo $PARSE_ERROR
echo ""
display_syntax
exit 1
fi
# Shelton Reese IBM db2 support
# Script to generate snapshot and return diagnostic information
# Most of the background found in Advanced DBA Cert Guide Chapter 8
# Create unique name for snapshot and diagnostic output files.
a=$dbName.$$
snapName=snapall.$a
ts=`date +%y%m%d%H%M%S`
echo
echo "Monitoring Database $dbName for $INTERVAL seconds"
date
echo
mkdir -p snapout
# Start gathering data.
echo "" > snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo Comment: $comment >> snapout/diagout.$a
echo Timestamp yymmddHHMMSS: $ts >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Getting snapshots. Please wait ..."
db2 get monitor switches | awk '{
if (index($0, "Buffer Pool Activity Information") > 0) bufferpool = $7
if (index($0, "Lock Information") > 0) lock = $5
if (index($0, "Sorting Information") > 0) sort = $5
} END {
printf("BUFFERPOOL %s LOCK %s SORT %s\n", bufferpool, lock, sort)
}' > /tmp/mon_switches
db2 "update monitor switches using BUFFERPOOL on LOCK on SORT on"
db2 reset monitor all
sleep $INTERVAL
db2 get snapshot for database manager > snapout/$snapName
db2 get snapshot for all on $dbName >> snapout/$snapName
echo "Getting current configurations"
db2 connect to $dbName
db2 "update monitor switches using `cat /tmp/mon_switches`"
rm -f /tmp/mon_switches
echo "######################" >> snapout/diagout.$a
echo "Current db2 Server Information " >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
db2 "select substr(os_name,1,15) as \"Operating System\", substr(rtrim(os_version) ||
'.' || rtrim(os_release),1,20) as \"OS Version\", substr(host_name, 1,18) as
\"Hostname\", total_cpus, total_memory as \"Memory Megs\" from
table(env_get_sys_info()) as si" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
db2 "select case(installed_prod)
       when 'ADCL'  then 'DB2 Application Development Client'
       when 'ADMCL' then 'DB2 Administration Client'
       when 'CONEE' then 'DB2 Connect Enterprise Edition'
       when 'CONPE' then 'DB2 Connect Personal Edition'
       when 'CUBE'  then 'DB2 Cube Views'
       when 'DLM'   then 'DB2 Data Links Manager'
       when 'ESE'   then 'DB2 Enterprise Server Edition'
       when 'EXP'   then 'DB2 Express Edition'
       when 'GSE'   then 'DB2 Spatial Extender'
       when 'PE'    then 'DB2 Personal Edition'
       when 'QP'    then 'DB2 Query Patroller'
       when 'RTCL'  then 'DB2 Run-Time Client'
       when 'WM'    then 'DB2 Warehouse Manager'
       when 'WSE'   then 'DB2 Workgroup Server Edition'
       when 'WSUE'  then 'DB2 Workgroup Server Unlimited Edition'
       else '?' end as \"Product\",
     case(is_licensed) when 1 then 'Yes' when 0 then 'No' end as \"Licensed\",
     prod_release from table(env_get_prod_info()) as pi" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
db2 "select substr(inst_name,1,10) as \"Instance Name\", num_dbpartitions as \"Number
Partitions\", rtrim(char(inst_ptr_size)) ||'bit' as \"Bitness\", fixpack_num as
Fixpack from table(env_get_inst_info()) as ii" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Current instance config settings" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
db2 get dbm cfg >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Current database config settings" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
db2 get db cfg for $dbName >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Current db2set values" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2set -all >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Snapshots for $dbName and Instance" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating ASLHEAPSZ DBM parameter setting." >> snapout/diagout.$a
echo "If many of the following lines are > 0, increase ASLHEAPSZ and restart instance." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
grep -i "Rejected Block Remote Cursor requests" snapout/$snapName >> snapout/diagout.$a 2>/dev/null
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Sort parameter settings." >> snapout/diagout.$a
echo "If following value (from dbm snapshot) is excessive, adjust SORTHEAP and/or SHEAPTHRES" >> snapout/diagout.$a
echo " parameters and restart instance." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select post_threshold_sorts from table(snapshot_dbm(-1)) as snapshot_dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Percent Piped Sorts value is low, increase the SHEAPTHRES parameter." >> snapout/diagout.$a
db2 "select DECIMAL(((real(piped_sorts_accepted) / real(piped_sorts_requested)) * 100),18,3) as pct_piped_sorts from table(snapshot_dbm(-1)) as snapshot_dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Piped Sorts Rejected value is high, increase either the SORTHEAP and/or SHEAPTHRES parameters." >> snapout/diagout.$a
db2 "select piped_sorts_requested, piped_sorts_accepted, (piped_sorts_requested - piped_sorts_accepted) as piped_sorts_rejected from table(snapshot_dbm(-1)) as snapshot_dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Percent Post Threshold Sorts is high, increase the SHEAPTHRES parameter." >> snapout/diagout.$a
# Not certain about this formula. It may be post_threshold_sorts / sum of total_sorts.
db2 "select DECIMAL(((real(post_threshold_sorts) / real(piped_sorts_requested)) * 100),18,3) as pct_post_threshold_sorts from table(snapshot_dbm(-1)) as snapshot_dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "(Informational) Total amount of private sort heap currently allocated:" >> snapout/diagout.$a
db2 "select sort_heap_allocated from table(snapshot_dbm(-1)) as snapshot_dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Sort Ops per Trans or Percent Overflow Sort values are excessive, adjusting SORTHEAP or SHEAPTHRES" >> snapout/diagout.$a
echo " may help, but complex or large SQL statements may be the cause." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select DECIMAL((real(total_sorts) / (real(commit_sql_stmts) + real(rollback_sql_stmts))),18,3) as sort_ops_per_trans from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
db2 "select DECIMAL(((real(sort_overflows) / real(total_sorts)) * 100),18,3) as pct_overflow_sorts from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "The Average Elapsed Time per Sort will decrease as performance improves." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select total_sort_time, total_sorts, DECIMAL((real(total_sort_time) /
real(total_sorts)),18,3) as avg_elapsed_time_per_sort from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Sort List Heap defines the max no. of private memory pages to be used for" >> snapout/diagout.$a
echo "each sort. This memory is deallocated after use. Bigger is usually better." >> snapout/diagout.$a
echo "Sort Heap Threshold is not a hard limit, but if it is exceeded, the faster" >> snapout/diagout.$a
echo "piped sorts will not be performed. The threshold impacts the total amount" >> snapout/diagout.$a
echo "of memory that can be allocated for all concurrent sort heaps." >> snapout/diagout.$a
echo "In general, the more Sort Heap Threshold memory the better." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Percentage of Free FCM Buffers." >> snapout/diagout.$a
echo "Only applicable if Intra-Partition Parallelism is on." >> snapout/diagout.$a
echo "If Percent Free FCM Buffers < 10%, increase FCM_NUM_BUFFERS" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select DECIMAL(((real(buff_free_bottom) / real(buff_free)) * 100),18,3) as pct_free_fcm_buffers from table(snapshot_fcm(-1)) as snapshot_fcm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Agent Pool Size." >> snapout/diagout.$a
echo "Review the following for high agents_created_empty_pool (no agents available) or" >> snapout/diagout.$a
echo " coord_agents_top (agents associated with stolen coord agent)" >> snapout/diagout.$a
db2 "select agents_registered, agents_registered_top, agents_waiting_top, idle_agents, agents_from_pool, agents_created_empty_pool, coord_agents_top, max_agent_overflows, agents_stolen, num_gw_conn_switches from table(snapshot_dbm(-1)) as snapshot_dbm" >> snapout/diagout.$a
echo "If Percent Agents Created is too high, increase NUM_INIT_AGENTS and review agent
parameters in the dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select DECIMAL(((real(agents_created_empty_pool) / (real(agents_from_pool) +
real(agents_stolen))) * 100),18,3) as pct_agents_created from table(snapshot_dbm(-1))
as snapshot_dbm" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating MAXFILOP DBM Parameter (Maximum Total of Files Open)." >> snapout/diagout.$a
echo "If Files Closed is > 0, best to increase the MAXFILOP parameter until the value
is zero." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select files_closed as files_closed from table(snapshot_database('$dbName', -1))
as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Log Buffer Size." >> snapout/diagout.$a
echo "Primary Logs should be sufficient for handling normal work (MAX_SEC_LOG_USED
should be very low or zero)." >> snapout/diagout.$a
echo " If not, increase LOG_PRIMARY or LOGFILSZ." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select sec_log_used_top as max_sec_log_used, tot_log_used_top as max_tot_log_used, total_log_used as tot_log_used, total_log_available as tot_log_space_available from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Ensure that there is sufficient current log space. If not, increase LOGPRIMARY
or LOGFILSZ." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select (total_log_available - total_log_used) as current_log_space_avail from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Ensure that the minimum available log space has been sufficient. If not,
increase LOGPRIMARY or LOGFILSZ." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select (total_log_available - tot_log_used_top) as min_log_space_avail from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Ensure that the ratio of log pages read to pages written is as low as possible." >> snapout/diagout.$a
echo " If the Log Pages read is > 0, increase LOGBUFSZ." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select log_reads, log_writes, DECIMAL((real(log_reads) / real(log_writes)),18,3)
as ratio_log_reads_to_writes from table(snapshot_database('$dbName', -1)) as
snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
# APPLHEAPSZ
# No way to monitor this. Advanced Certification Guide suggests setting to 1024
# and increasing if errors occur.
# We have seen errors in the past, so have increased to 2048.
echo "######################" >> snapout/diagout.$a
echo "Evaluating Package Cache Hit Ratio." >> snapout/diagout.$a
echo "If Package Cache Hit ratio is less than .8, increase the PCKCACHESZ." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select pkg_cache_inserts, pkg_cache_lookups, DECIMAL((1 - (real(pkg_cache_inserts) / real(pkg_cache_lookups))),18,3) as pkg_cache_hit_ratio from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Locks." >> snapout/diagout.$a
echo "MAXLOCKS * MAXAPPLS should not be less than 100." >> snapout/diagout.$a
echo "If Average Locks per Appl > 1, adjust LOCKLIST or MAXLOCKS." >> snapout/diagout.$a
echo "If Average Wait Time for Locks is high, adjust LOCKLIST or MAXLOCKS." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select appls_cur_cons, locks_held, case when (appls_cur_cons > 0) then DECIMAL((real(locks_held)) / (real(appls_cur_cons)),18,3) else -1 end as avg_locks_per_appl from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
db2 "select case when (lock_waits > 0 ) then DECIMAL((real(lock_wait_time)) /
(real(lock_waits)),18,3) else -1 end as avg_wait_time_for_locks from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If the average lock wait time is high, look for applications that hold many locks" >> snapout/diagout.$a
echo " or cause lock escalations, with a focus on tuning the applications to improve concurrency." >> snapout/diagout.$a
echo "If escalations are high along with a high average lock wait time, the values of" >> snapout/diagout.$a
echo " one or both of LOCKLIST and MAXLOCKS are too low." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Review the following." >> snapout/diagout.$a
db2 "select locks_held, lock_waits, lock_wait_time from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
db2 "select lock_list_in_use, deadlocks, lock_escals, x_lock_escals from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
db2 "select locks_waiting, lock_timeouts, int_deadlock_rollbacks from
table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
#-- shows the holder and waiting for any lock chains.
echo "######################" >> snapout/diagout.$a
echo "Evaluating Catalog Cache Hit Ratio." >> snapout/diagout.$a
echo "If Catalog Cache Hit ratio is less than .8, increase the CATALOGCACHE_SZ." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select cat_cache_inserts, cat_cache_lookups, DECIMAL((1 - (real(cat_cache_inserts) / real(cat_cache_lookups))),18,3) as cat_cache_hit_ratio from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Asynchronous Page Cleaners." >> snapout/diagout.$a
echo "If Percent Asynch Data Writes < 90%, reduce NUM_IOCLEANERS." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select pool_async_data_writes, pool_data_writes, case when (pool_data_writes > 0
) then DECIMAL(((real(pool_async_data_writes) / real(pool_data_writes)) * 100),18,3)
else -1 end as pct_asynch_data_writes from table(snapshot_database('$dbName', -1)) as
snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Percent Asynch Index Writes < 90%, reduce NUM_IOCLEANERS." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select pool_async_index_writes, pool_index_writes, case when pool_index_writes >
0 then DECIMAL(((real(pool_async_index_writes) / real(pool_index_writes)) * 100),18,3)
else -1 end as pct_asynch_index_writes from table(snapshot_database('$dbName', -1)) as
snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Percent Bad Page Clnr Trigs is too high, add page cleaners." >> snapout/diagout.$a
echo "If Percent Bad Page Clnr Trigs is too low, reduce page cleaners." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select pool_drty_pg_steal_clns, pool_lsn_gap_clns, pool_drty_pg_thrsh_clns, case
when (pool_drty_pg_steal_clns + pool_drty_pg_thrsh_clns + pool_lsn_gap_clns) > 0 then
DECIMAL(((real(pool_drty_pg_steal_clns) / (real(pool_drty_pg_steal_clns) +
real(pool_drty_pg_thrsh_clns) + real(pool_lsn_gap_clns))) * 100),18,3) else -1 end as
pct_bad_page_clnr_trigs from table(snapshot_database('$dbName', -1)) as
snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "If Avg Pg Per Async Write is about 10000, cleaners are good. Otherwise, review
and adjust." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select case when (pool_drty_pg_thrsh_clns + pool_lsn_gap_clns) > 0 then DECIMAL((real(pool_async_data_writes) + real(pool_async_index_writes)) / (real(pool_drty_pg_steal_clns) + real(pool_drty_pg_thrsh_clns) + real(pool_lsn_gap_clns)),18,3) else -1 end as avg_pg_per_async_write from table(snapshot_database('$dbName', -1)) as snapshot_database" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Evaluating Buffer Pool Efficiency." >> snapout/diagout.$a
echo "If ratios are low, increase size of buffer pool or move key table out to its own tablespace." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
db2 "select bp_name, pool_index_l_reads, pool_data_l_reads, pool_index_p_reads, pool_data_p_reads, case when (pool_index_l_reads + pool_data_l_reads) > 0 then DECIMAL(((1 - (real((pool_index_p_reads + pool_data_p_reads)) / real((pool_index_l_reads + pool_data_l_reads)))) * 100),18,3) else -1 end as bpool_hit_ratio from table(snapshot_bp('$dbName', -1)) as snapshot_bp" >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "Executing SQL queries to gather heavy hitter information. Please wait ..."
echo "######################" >> snapout/diagout.$a
echo "Examining dynamic SQL by various criteria." >> snapout/diagout.$a
echo "Note that this reflects only SQL statements in cache at the time of the
snapshot." >> snapout/diagout.$a
echo "" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Top 10 SQL Statements by Total Execution Time" >> snapout/diagout.$a
db2 "SELECT substr(stmt_text,1,500) as sql_statement, total_exec_time,
total_sys_cpu_time, total_usr_cpu_time, num_executions, num_compilations FROM TABLE(
SNAPSHOT_DYN_SQL('$dbName', -1)) as dynSnapTab ORDER BY total_exec_time desc FETCH
FIRST 10 ROWS ONLY" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Top 10 SQL Statements by Average Execution Time" >> snapout/diagout.$a
db2 "SELECT substr(stmt_text,1,500) as sql_statement, total_exec_time, num_executions,
DECIMAL((real(total_exec_time) / real(num_executions)),18,9) as avg_exec_time FROM
TABLE( SNAPSHOT_DYN_SQL('$dbName', -1)) as snapshot_dyn_sql WHERE num_executions > 0
ORDER BY 4 desc FETCH FIRST 10 ROWS ONLY" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Top 10 SQL Statements by Average CPU Time" >> snapout/diagout.$a
db2 "SELECT substr(stmt_text,1,500) as sql_statement, total_sys_cpu_time,
total_usr_cpu_time, num_executions, DECIMAL(((real(total_sys_cpu_time) +
real(total_usr_cpu_time)) / real(num_executions)),18,9) as avg_cpu_time FROM TABLE(
SNAPSHOT_DYN_SQL('$dbName', -1)) as dynSnapTab WHERE num_executions > 0 ORDER BY
avg_cpu_time desc FETCH FIRST 10 ROWS ONLY" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Top 10 SQL Statements by Number of Executions" >> snapout/diagout.$a
db2 "SELECT substr(stmt_text,1,500) as sql_statement, total_exec_time, num_executions, stmt_sorts, num_compilations, DECIMAL((real(total_exec_time) / real(num_executions)),18,9) as avg_exec_time FROM TABLE( SNAPSHOT_DYN_SQL('$dbName', -1)) as snapshot_dyn_sql WHERE num_executions > 0 ORDER BY 3 desc FETCH FIRST 10 ROWS ONLY" >> snapout/diagout.$a
echo "######################" >> snapout/diagout.$a
echo "Top 10 SQL Statements by Number of Sorts " >> snapout/diagout.$a
db2 "SELECT substr(stmt_text,1,500) as sql_statement, total_exec_time, num_executions, stmt_sorts, num_compilations, DECIMAL((real(total_exec_time) / real(num_executions)),18,9) as avg_exec_time FROM TABLE( SNAPSHOT_DYN_SQL('$dbName', -1)) as snapshot_dyn_sql WHERE num_executions > 0 ORDER BY stmt_sorts desc FETCH FIRST 10 ROWS ONLY" >> snapout/diagout.$a
AWKSCRIPT="`dirname $0`/`basename $0 .ksh`.awk"
RESULTS_FILE="db2_cfg"
RES_EXT=".out"
SUM_EXT=".sql"
DB_NAME=`echo $dbName | tr '[a-z]' '[A-Z]'`
RES_EXT="_${DB_NAME}${RES_EXT}"
SUM_EXT="_${DB_NAME}${SUM_EXT}"
RES_OUT=snapout/${RESULTS_FILE}${RES_EXT}
SUM_OUT=snapout/${RESULTS_FILE}${SUM_EXT}
[ $QUIET -eq 1 ] && Q_OUTPUT=">> $SUM_OUT" || Q_OUTPUT="| tee -a $SUM_OUT"
rm $SUM_OUT 2>/dev/null
if [ $REUSE_OUT -eq 0 ]
then
    echo "[Creating: $SUM_OUT]"
    eval echo "-- $comment"                                 $Q_OUTPUT
    eval echo "-- ----------------------------------------" $Q_OUTPUT
    eval echo "-- Invocation: $0 $PARAMS"                   $Q_OUTPUT
    eval echo "-- `date`"                                   $Q_OUTPUT
    eval echo "--"                                          $Q_OUTPUT
    echo "[Creating: $RES_OUT]"
    echo ""
    eval echo "-- db2 get dbm cfg" $Q_OUTPUT
    db2 get dbm cfg > $RES_OUT
    eval awk -f $AWKSCRIPT $RES_OUT $Q_OUTPUT >> snapout/diagout.$a
    eval echo "-- db2 get db cfg for $dbName" $Q_OUTPUT
    db2 get db cfg for $dbName > $SUM_OUT
    eval awk -f $AWKSCRIPT $SUM_OUT $Q_OUTPUT >> snapout/diagout.$a
fi
echo Snapshot is in snapout/snapall.$a
echo Diagnostic output is in snapout/diagout.$a
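The buffer pool hit ratio the script derives from the snapshot counters can be sanity-checked by hand. This is a standalone sketch of the same formula; the counter values are made up for illustration, not taken from any real snapshot:

```shell
# Hypothetical counters from a bufferpool snapshot:
logical=100000   # pool_data_l_reads + pool_index_l_reads
physical=12000   # pool_data_p_reads + pool_index_p_reads

# Hit ratio (%) = (1 - physical/logical) * 100, matching bpool_hit_ratio above.
awk -v l="$logical" -v p="$physical" \
    'BEGIN { printf("bpool_hit_ratio: %.3f\n", (1 - p / l) * 100) }'
# Prints: bpool_hit_ratio: 88.000
```

A ratio in the high 90s is typical of a well-sized buffer pool; values like the 88% above suggest reviewing the buffer pool size or tablespace placement, as the script advises.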