Test 4 - Oracle 8: Performance Tuning Workshop (Exam # 1Z0-014)
Discussing the Nature of Tuning

Tuning is done on an Oracle database for many reasons. Users want their online applications to run faster, developers want batch processes to run faster, and management often recognizes the need for both. One solution to the problem of performance is to invest in the latest hardware: faster processors, more memory, and more disk space. This is often an effective solution, and methods for maximizing the hardware on the machine hosting the Oracle database are discussed to improve your understanding in this area for OCP. However, the latest and greatest machines are also the most expensive, and organizations generally need to plan hardware purchases some time in advance, which means that acute performance problems are not usually resolved by the hardware purchase approach. Instead, the DBA must determine other ways to improve performance, and to meet the needs of an ongoing application that encounters performance issues, the DBA must know how to resolve them.

Most performance problems on an Oracle database can be resolved with three methods: the purchase of new hardware just described, effective database configuration, and effective application design. Everyone who uses Oracle databases should understand that by far the greatest performance problems are caused by the application, not the Oracle database. Poorly written SQL statements, the use of multiple SQL statements where one would suffice, and other problems within an application are the source of most performance issues.
The DBA should always place responsibility for the first step in any performance situation on the application developers, to see whether they can rewrite the application's code to use the database more effectively. Only after every possibility of resolving the issue by redeveloping the application is exhausted should the DBA attempt changes to the configuration of the Oracle database. This consideration prevents impromptu reconfiguration of the database that satisfies a performance need in one area only to create a performance problem in another. Any change to the configuration of the Oracle database should be considered carefully, and the DBA should weigh the trade-offs needed to improve performance in one area. For example, when changing the memory management configuration without buying and installing more memory, the DBA must be careful not to size any part of the Oracle SGA out of real memory. Also, if the Oracle SGA takes up more of the existing memory, other applications running on the same machine may suffer. The DBA may need to work with the machine's systems administrator to decide how to make the trade-off.

Outlining a Tuning Methodology

The following steps outline an appropriate tuning methodology for DBAs:

1. Tune the applications.
2. Tune memory structures.
3. Tune disk I/O usage.
4. Detect and eliminate I/O contention.

These steps are the Oracle-recommended outline of tasks to execute in all tuning situations, and the DBA is encouraged to use them when he or she is not sure of the cause of poor performance on the database. The important feature to remember from this section is to follow the logical hierarchy, or scope, of each change.
OCP Exam 4 will focus some attention on taking the most appropriate tuning measure without making sweeping changes to the database, in order to avoid causing more problems than are solved.

Identifying Diagnostic Tools

The most important step in solving performance issues is discovering them. The easiest way to discover performance issues is to wait for developers or users to call and complain, but this method is not very customer-oriented and has proved detrimental to the reputation of many information technology departments. Taking performance problems on the chin is not required, given the availability of tools from Oracle to help monitor and eliminate performance issues. Here are some utilities designed to assist in tuning the Oracle instance:

UTLBSTAT This creates tables to store dynamic performance statistics for the Oracle database, and executing the script also begins the statistics collection process. To make collection more effective, the DBA should not run UTLBSTAT until the database has been running for several hours or days. The utility uses the underlying V$ performance views (see V$ views below) to find information about the performance of the database, and the accumulation of useful data in those views may take some time.

UTLESTAT This ends statistics collection for the instance and produces an output file called report.txt containing a report of the statistics gathered since UTLBSTAT was run. To maximize the effectiveness of these two utilities, UTLBSTAT should be allowed to run for a long time, under a variety of circumstances, before collection is ended. These circumstances include batch processing, online transaction processing, backups, and periods of inactivity. This wide variety of database activity gives the DBA a more complete idea of the level of usage the database experiences under normal circumstances.
SERVER MANAGER Oracle's Server Manager tool contains several menu options for monitoring the database and diagnosing problems. This menu is usable when Server Manager is run in GUI mode. Since most Oracle DBAs who use UNIX as the operating system supporting Oracle run Server Manager in line mode, this option may not be as familiar to them as options that take advantage of line mode operation.

EXPLAIN PLAN This command lets the DBA or developer determine the execution path of a block of SQL code, as generated by the SQL statement processing mechanism. It can be executed by entering explain plan set statement_id = 'name' into plan_table for SQL_statement at the SQL*Plus prompt. The execution plan shows the step-by-step operations Oracle will undertake to obtain data from the tables comprising the database. The option is provided mainly so developers and users who write SQL statements against the database can avoid running inefficient SQL; an example of an operation that obtains data inefficiently is a full table scan. The output is placed into a special table called PLAN_TABLE, created by running the utlxplan.sql script found in the rdbms/admin subdirectory under the Oracle software home directory. The execution information can be retrieved from PLAN_TABLE using a special query provided by Oracle; the query displays the information in a way that requires interpretation from the innermost indented operation outward, and then from top to bottom.

ENTERPRISE MANAGER PERFORMANCE PACK This set of utilities contains products that help the DBA identify performance issues. The package is available mainly to organizations using Oracle on servers running Windows operating systems.
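The explain plan workflow described above can be sketched as follows; the statement_id value and the query being explained are illustrative assumptions, not part of the source:

```sql
-- One-time setup: create PLAN_TABLE in the current schema.
@$ORACLE_HOME/rdbms/admin/utlxplan

-- Generate an execution plan for a hypothetical query:
EXPLAIN PLAN SET STATEMENT_ID = 'demo1' INTO plan_table FOR
  SELECT * FROM emp WHERE empno = 7839;

-- Retrieve the plan, indented so inner operations can be read outward:
SELECT LPAD(' ', 2 * level) || operation || ' ' || options
       || ' ' || object_name AS plan_step
FROM   plan_table
WHERE  statement_id = 'demo1'
CONNECT BY PRIOR id = parent_id
START WITH id = 0;
```

The CONNECT BY query shown is one common form of the retrieval query Oracle supplies; the emp table and the 'demo1' identifier are placeholders.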
The Performance Pack may not be as well known to DBAs in organizations using Oracle in UNIX environments, because many DBAs there use Oracle database management tools in line mode.

SQL TRACE This tool extends the functionality of explain plan by giving statistical information about the SQL statements executed in a session that has tracing enabled. This additional statistical information is provided in a dump file. The utility is run for an entire session using the alter session set sql_trace = true statement. Tracing a session is especially useful for analyzing the full operation of an application or batch process containing multiple transactions, where it is unclear which part of the application or batch process is encountering performance issues.

TKPROF The dump file produced by SQL Trace is often hard to read. TKPROF takes the output in a trace file and turns it into a more understandable report. The relationship between SQL Trace and TKPROF is similar to that between EXPORT and IMPORT: TKPROF only operates on, or accepts as input, the output file produced by SQL Trace. The contents of the report produced by TKPROF are discussed later.

V$ VIEWS These are views against several memory structures created in the Oracle SGA at instance startup. They contain database performance information useful to both DBAs and Oracle for determining the current status of the database. Performance tuning tools, such as those in the Oracle Enterprise Manager or in Server Manager running in GUI mode, as well as utilities from third-party vendors, use the underlying V$ performance views as their basis for information.

Diagnosing Problems

Running UTLBSTAT and UTLESTAT

These two utilities, mentioned earlier, provide the functionality required to maintain a history of performance information. To review, UTLBSTAT creates several statistics tables for storing the dynamic performance information.
It also begins the collection of dynamic performance statistics. The script is typically found in the rdbms/admin directory under the Oracle software home directory and is executed from within Server Manager. Though it is not necessary to execute the connect internal command itself before running the script, the DBA should connect to the database as a user with connect internal privileges. The overhead of collecting these statistics, though not usually sizeable, can have an impact on the system as high as 10 percent, depending on the platform running Oracle. UTLBSTAT creates tables to store data from several V$ performance views, including V$WAITSTAT, V$SYSTEM_EVENT, V$SYSSTAT, V$ROLLSTAT, V$ROWCACHE, V$LATCH, V$SESSION, V$LIBRARYCACHE, and V$SESSION_EVENT.

UTLESTAT ends the collection of performance statistics from the views named above. The script is typically found in the same rdbms/admin location under the Oracle software home directory and is also executed from Server Manager, again connected as a user with connect internal privileges. The utility gathers all statistics collected and uses them to generate an output file called report.txt; after generating report.txt, it removes the statistics tables it used to store the performance history of the database. The contents of report.txt are discussed shortly. Care should be taken not to shut down the database while UTLBSTAT is running. If this happens, there could be problems interpreting the data, and since the database must run for several hours for the V$ views that UTLBSTAT depends on to contain useful data, all work done by UTLBSTAT will be useless.
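The begin/end sequence described above can be sketched as follows; the use of $ORACLE_HOME in the paths is an illustrative assumption:

```sql
-- From Server Manager, connected as a user with connect internal privileges.

-- Begin statistics collection (creates the statistics tables):
@$ORACLE_HOME/rdbms/admin/utlbstat

-- ... let the database run through batch, OLTP, backup, and idle periods ...

-- End collection, generate report.txt, and drop the statistics tables:
@$ORACLE_HOME/rdbms/admin/utlestat
```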
The best thing to do after an untimely shutdown is to run UTLESTAT as soon as possible to clear out all data from the prior run, and wait until the database has been up long enough to attempt a second execution.

Describing Contents of report.txt

The report.txt file provides a great deal of useful information in several areas. First, it provides statistics for file I/O by tablespace and datafile; this information is useful in distributing files across many disks to reduce I/O contention. Second, it provides SGA, shared pool, dictionary cache, table/procedure, trigger, pipe, and other cache statistics. report.txt is also used to determine whether there is contention for any of several different resources: it gives latch wait statistics for the database instance, showing whether there is contention for resources guarded by latches, and statistics for how often user processes wait for rollback segments, which can be used to determine whether more rollback segments should be added. The average length of the dirty buffer write queue is also shown, which the DBA can use to determine whether DBWR is having difficulty writing blocks to the database. Finally, report.txt contains a listing of all initialization parameters for the database and the start and stop times for statistics collection.

Understanding Latch Contention

Similar to locks, latches exist to limit access to certain types of resources. There are at least 40 different latches in the Oracle database. Two important latches manage the redo log resource: the redo copy latch and the redo allocation latch. Latches can be monitored using the dynamic performance views V$LATCH, V$LATCHHOLDER, and V$LATCHNAME. The statistics to monitor are the number of times a process has to wait for Oracle to fulfill a latch request versus the number of times a process requests a latch and obtains it.
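Latch monitoring along these lines can be sketched with a query against V$LATCH; the ratio arithmetic follows the (MISSES/GETS)*100 and (IMMEDIATE_MISSES/IMMEDIATE_GETS)*100 formulas described next, and the DECODE guards against division by zero are an added assumption:

```sql
-- Miss ratios per latch, for willing-to-wait and immediate requests:
SELECT name,
       DECODE(gets, 0, 0,
              (misses / gets) * 100)                       AS wtw_miss_pct,
       DECODE(immediate_gets, 0, 0,
              (immediate_misses / immediate_gets) * 100)   AS imm_miss_pct
FROM   v$latch
ORDER  BY name;
```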
A process denied access to a latch will generally do one of two things, because its request is either immediate or willing to wait. These two types of request affect the collection of V$LATCH statistics in the following way. V$LATCH tracks the number of GETS, MISSES, and SLEEPS for processes willing to wait on a latch request; a process sleeps if its latch request is denied and the process wants to wait for the latch to become available. V$LATCH also tracks the number of IMMEDIATE_GETS and IMMEDIATE_MISSES for processes that want the latch immediately or will otherwise move on. The miss ratio for processes willing to wait is the number of MISSES divided by GETS, times 100, or (MISSES/GETS)*100. The miss ratio for processes requiring immediate access to latches is the number of IMMEDIATE_MISSES divided by IMMEDIATE_GETS, times 100, or (IMMEDIATE_MISSES/IMMEDIATE_GETS)*100.

Checking for Events Causing Waits

Events that are causing waits appear in both the V$LOCK dynamic performance view and the V$LATCHHOLDER view. Both views list information about the processes currently holding the keys to access certain resources on the Oracle database. If there are high wait ratios associated with a process holding a latch or lock, as reflected by statistics gathered from V$LATCH or by the presence of a process ID in the V$LOCK view, there could be a contention issue on the database.

Demands of Online Transaction Processing Systems

Online transaction processing, or OLTP, is a common system type in many organizations. When you think about data entry, you are thinking about OLTP. These applications are characterized by high data change activity, such as inserts and updates, usually performed by a large user base.
Some examples of this type of system include order entry systems, ticketing systems, timesheet entry systems, payments received systems, and other systems representing the entry and change of large amounts of data. (A diagram in the original text illustrates data volume and direction on OLTP systems.) Data in these systems is highly volatile. Because data changes quickly and frequently, one design requirement for OLTP systems is the ability to enter, change, and correct data quickly without sacrificing accuracy. Since many users of the system may manipulate the same pieces or areas of data, mechanisms must exist to prevent users from overwriting one another's data. Finally, because users may make changes or additions to the database based on existing data, there must be mechanisms for seeing changes online quickly.

There are several design paradoxes inherent in OLTP systems. First, OLTP systems must facilitate fast data entry without sacrificing accuracy, yet any mechanism that checks the data being entered causes some performance degradation. Oracle provides a good structure for checking data entry in the form of integrity constraints, such as check constraints and foreign keys. Since these mechanisms are built into the data definition language, they are more efficient than using table triggers to enforce integrity; Oracle thus solves this paradox for all but the most complex business rules, which must still be enforced with triggers. Second, OLTP systems typically need to see data in real time, which creates one of the largest design paradoxes in OLTP systems. Oracle's mechanisms for facilitating data retrieval are indexes and clusters, and both work better on tables that experience infrequent data change. This is true for indexes because every change to an indexed column means a required change to the index.
In the case of clusters, since the cluster must be carefully sized to allow a given amount of data to hang off the cluster index, data changes in clustered tables can lead to row migration and chaining, two effects that will kill any performance gains the cluster may give. Yet data change is the primary function of an OLTP system. The designers and DBAs of such systems must therefore work with users to strike an effective trade-off between viewing data quickly and making data changes quickly. This goal can be accomplished through data normalization. By reducing functional dependency between pieces of information as part of the normalization process, the database can store pieces of data indexed on each table's primary key. This design feature, combined with a few appropriately created foreign keys to speed table joins, provides data retrieval performance that is acceptable in most cases. If possible, DBAs should participate in the data modeling process to better understand which tables are frequently updated. In general, it is wise for the DBA to put frequently updated tables in a special data tablespace that is backed up frequently. That tablespace can also have default data block settings with a high pctfree and a low pctused to reduce the chances of row migration and chaining. Although configuring data blocks this way can waste disk space, the desired effect of preventing row migration is obtained. Finally, keep the use of indexes as low as possible to avoid the overhead of updating both the table and the index.

Demands of Decision Support Systems

Decision support systems (DSS) place several different demands on the database than OLTP applications do. Decision support systems typically store large amounts of data that is made available to a small user base via reports, and those reports may be intensive in the amount of CPU time they take to execute.
Fast access to volumes of data, and well-tuned SQL queries as the underlying architecture for reports, are the keys to success in DSS applications. Mechanisms that enhance query performance, such as clusters and indexes, are commonly employed to produce reports quickly. Since data changes will most likely be handled by batch processes at off-peak usage times, the performance overhead of change activity will probably not affect database performance or user satisfaction. Decision support systems often coexist with OLTP systems, as a reporting environment, an archive repository, or both.

Requirements of Client/Server Environments

The other major application type discussed is the client/server application. In client/server architectures there are often many client processes attempting to request data from only one server, so DBAs often use the multithreaded server options available with Oracle Server. Although two of the major components in the client/server architecture are identified by its name, there is an unseen element that is as important to the success of the system as the client and the server: the network that transports communication between client and server processes. Simply having a connection between two machines is not enough, however; the processes require a common "language" to use for interprocess communication. Oracle offers a solution to this issue with SQL*Net, a program designed to facilitate communication at the network layer between clients and Oracle Server. One of the main components of SQL*Net is the listener process, which listens to the network connection for incoming data requests from client processes. When such a request comes across the network, the listener takes the request and routes it along to a server process in the database that will handle tasks such as obtaining data from Oracle datafiles on behalf of the user process.
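A minimal listener configuration along these lines might look as follows; this is a sketch of a listener.ora fragment, and the host name, port, SID, and Oracle home path are all illustrative assumptions:

```
LISTENER =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost)(PORT = 1521)))

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (SID_NAME = ORCL)
      (ORACLE_HOME = /u01/app/oracle)))
```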
There are two ways that server processes run in Oracle. The first is known as the dedicated server configuration. In this configuration, every user process that accesses Oracle has its own server process handling disk reads and other Oracle database processing; the relationship between Oracle server processes and user processes is therefore one to one. This configuration, however, requires extra memory on the database machine. The other option is the multithreaded server (MTS) configuration. In this architecture, Oracle has a limited number of server processes that handle disk reads for multiple user processes, so each server process serves many user processes. In the MTS architecture, access to the server processes is brokered by another process called a dispatcher. User processes are connected to the dispatcher by way of the listener, and the dispatcher connects the user process to a shared server process.

Tuning SQL Statements

Since SQL tuning often produces the most noticeable benefits for an application or query, it is generally the best place to start when tuning for performance gains. It is also the best place to start for reasons of scope: the DBA should determine that everything has been done to optimize the SQL statements before making sweeping changes to memory-, I/O-, and contention-related areas that have the potential to break the database elsewhere. Simply adding an index, or ensuring that an index is being used properly, can save problems later.

Identifying Inefficient SQL

Potential causes of inefficient SQL start with ineffective use of indexes associated with large tables, or the absence of such indexes. If there is no index for queries against a large table to use, or if all data is selected from the table, or if an operation (such as nvl, upper, etc.)
is performed on an indexed column in the where clause, then the database will perform a full table scan to obtain the requested information, and the performance of the query will be in direct proportion to the number of rows in the table. Another cause of inefficient SQL is the use of B-tree indexes on columns that contain few unique values. This low uniqueness, or low cardinality, causes the index search to perform poorly; if a column with low cardinality must be indexed, the DBA should use a bitmap index rather than a B-tree index. Another cause of poor SQL statement performance involves views. If a view is used in a query, the best performance occurs when the view's underlying select statement is resolved as a component of the overall query, rather than the view resolving first and the rest of the query then being applied as a filter on the data returned. Finally, another cause of poor performance on a table with indexes consisting of multiple columns, sometimes called concatenated or composite indexes, is using those columns in the wrong order in the SQL where clause. When using a composite index to search for data, the query should list the columns in the where clause in the order specified by the composite index. A user who is unsure about the column order in a composite index can consult the ALL_IND_COLUMNS data dictionary view for the table's indexes and their column order.

Tools to Diagnose Problems

To tune query performance, it helps to use the explain plan statement offered by Oracle in conjunction with the database's SQL processing mechanism. Any user can use the explain plan statement provided they have a special table called PLAN_TABLE created in their user schema. This table can be created using the utlxplan.sql script found in the rdbms/admin directory off the Oracle software home directory.
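Checking composite-index column order as described above can be sketched with a query like this one; the ORDERS table name is an illustrative assumption:

```sql
-- List each index on a hypothetical ORDERS table, with its columns
-- in the order they appear in the index definition:
SELECT index_name, column_position, column_name
FROM   all_ind_columns
WHERE  table_name = 'ORDERS'
ORDER  BY index_name, column_position;
```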
The results of the explain plan command are stored in PLAN_TABLE. It is important to include a unique statement_id in the explain plan command syntax to facilitate retrieving the execution plan for the query.

Using Autotrace

A query against PLAN_TABLE is used to retrieve the execution plan data. The execution plan itself is read from the innermost indented operations outward, with results from inner operations feeding as input into the outer operations, and then from top to bottom. It is important to watch for full table scans, as these are the operations most detrimental to query performance. The database session can also be configured to generate execution plans using the set autotrace on command. There are advantages and drawbacks to this approach. One advantage is that the user does not need to specify the explain plan syntax for every statement executed during the session; autotrace simply generates the plan and stores it. However, the execution plan for a query is not available until the statement completes, making it difficult to tune the query proactively.

Using SQL Trace and TKPROF

Other tool options for tuning SQL statements include the SQL Trace feature used in conjunction with TKPROF. SQL Trace tracks performance statistics for SQL statements, placing numbers next to the operations provided by explain plan. These statistics include the number of times the statement is parsed and executed, how often records are fetched if the statement is a select, CPU elapsed time and real elapsed time for the query, block reads, processed rows, and library cache misses. SQL Trace must be used in conjunction with proper settings for the TIMED_STATISTICS, MAX_DUMP_FILE_SIZE, and USER_DUMP_DEST initialization parameters. The output of a SQL Trace dump file is difficult to read, giving rise to the need for TKPROF, which takes the trace file as input and produces a report of the statistics named above for queries executed in the session.
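The trace workflow described above can be sketched as follows; the parameter values and file names shown are illustrative assumptions:

```sql
-- Prerequisite init.ora settings (illustrative values):
--   timed_statistics   = true
--   max_dump_file_size = 10240
--   user_dump_dest     = /u01/app/oracle/admin/orcl/udump

-- Turn tracing on for the current session:
ALTER SESSION SET SQL_TRACE = TRUE;

-- ... run the statements to be analyzed ...

ALTER SESSION SET SQL_TRACE = FALSE;

-- Then, from the operating system prompt, format the trace file:
--   tkprof ora_1234.trc report.prf
```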
Tracing can be specified on a session basis or for the entire instance. To enable tracing, set the SQL_TRACE parameter either with the alter session statement (session-wide) or in the init.ora file (instance-wide), restarting the instance in the latter case.

Using the DBMS_APPLICATION_INFO Package

To track performance statistics in large, complex code environments, it helps to register code components with the database using the DBMS_APPLICATION_INFO package. This package provides functionality to name the various modules of code and identify the actions they perform. Statistics for the execution of the modules, and of the actions within them, can then be tracked using the dynamic performance views V$SESSION and V$SQLAREA.

Tuning the Shared Pool

The shared SQL pool consists of some major components required by users of the database. The two major structures in the shared pool are the dictionary, or "row," cache and the library cache. The dictionary cache stores rows from data dictionary tables in memory for use by user processes, to improve the performance of queries against the data dictionary. The library cache contains several elements, including parsed SQL statements, for the purpose of minimizing the storage cost of parse information and speeding the execution of SQL when multiple users execute the same statements. The library cache also contains executable versions of PL/SQL packages and procedures, as well as control structures such as locks and cache handles (addresses in memory). Each object in the shared pool is designed to improve performance in some aspect of the Oracle database. The performance of each of these objects is quantified using a calculated hit ratio, where "hits" are defined relative to the object being measured.
In the case of the row cache, a hit occurs when a process, or Oracle itself, looks for data from the data dictionary and finds it in the row cache. In the library cache, a hit occurs when a process needs to execute a SQL statement and finds it already parsed and waiting in the library cache.

Measuring the Shared Pool Hit Ratio

The performance of the library cache is quantified by calculating a hit ratio, determined by pulling the relevant statistics from the appropriate dynamic performance view. In this case, the DBA works with the V$LIBRARYCACHE performance view, specifically the statistics collected in its PINS and RELOADS columns. A pin occurs when a user process goes to parse a statement in the library cache and finds that a parsed version already exists; because the parse tree was recently used, the parsed statement stays "pinned" in the library cache as a most recently used object. RELOADS is the number of times Oracle had to reparse a statement because the interval between its parsing and its execution was long enough that Oracle had eliminated the parse tree from the library cache to make room for another statement. Reloads indicate either a great deal of activity on the database or a great many unique statements being executed, as happens, for example, when many users are permitted to run ad hoc SQL against the database. The formula for the library cache ratio is (RELOADS/PINS) * 100; note that this is really a reload, or miss, ratio, so lower values are better. To quantify the same behavior on the row cache, the dynamic performance view V$ROWCACHE is queried for the statistics in its GETS and GETMISSES columns.
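Both ratios can be pulled with queries along these lines; summing across the views' rows to get an instance-wide figure is an added assumption beyond the source's formulas:

```sql
-- Library cache reload ratio: (RELOADS/PINS) * 100, lower is better.
SELECT (SUM(reloads) / SUM(pins)) * 100 AS libcache_reload_pct
FROM   v$librarycache;

-- Row (dictionary) cache miss ratio: (GETMISSES/GETS) * 100.
SELECT (SUM(getmisses) / SUM(gets)) * 100 AS rowcache_miss_pct
FROM   v$rowcache;
```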
The formula for calculating the row cache ratio is (GETMISSES/GETS) * 100. If the result is under about 10-15 percent, the hit ratio and performance of the dictionary cache should be satisfactory. A value above that may not produce satisfactory performance, either for Oracle or for user processes.

Monitoring Latches to Detect Shared Pool Contention

Sizing the Shared Pool

When there is a performance issue with the library cache or the dictionary cache, the shared pool must be resized, and it needs to be sized in an appropriate manner. The initialization parameter that controls shared pool size is SHARED_POOL_SIZE, measured in bytes. Care should be taken when performing operations that increase the size of any Oracle memory structure to make sure no part of the SGA is sized out of the real memory available on the system. The best approach to increasing the size of the shared pool is either to keep the overall size of the SGA the same as before and simply reallocate memory from one area of the SGA to another, or to add physical memory to the hardware and allocate it to Oracle.

Pinning Objects in the Shared Pool

It may become useful in some instances to place objects in the shared pool on a long-term basis. The objects the DBA may want to keep there on that longer-term basis are objects that go in the library cache, the structure that stores parsed statement information for reuse by identical statements executing within the Oracle database. Reasons for pinning objects in the shared pool include wanting a performance increase for a statement not used frequently enough for Oracle's LRU algorithm to keep its parse information in memory, or a memory fragmentation issue that prevents a large SQL or PL/SQL block from entering the library cache for parsing.
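Pinning a package can be sketched as follows; this assumes the DBMS_SHARED_POOL package has been installed (typically via the dbmspool.sql script, a detail beyond this source), and the package name shown is only an illustrative choice:

```sql
-- Pin a PL/SQL package in the shared pool; 'P' flags a procedure/package:
EXECUTE dbms_shared_pool.keep('SYS.STANDARD', 'P');
```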
In general, the steps required for pinning objects in the shared pool are:

1. Free all space in the shared pool, either by flushing the shared pool or by restarting the instance.
2. Reference the object to be pinned.
3. Execute the keep( ) procedure in the DBMS_SHARED_POOL package, passing it the name of the object and a flag indicating what type of code block it is: P, C, or R for procedure, cursor, or trigger, respectively.

Tuning Redo Mechanisms and Determining Contention Between ARCH and LGWR

Redo log mechanisms are the next critical portion of the SGA for which tuning is covered in this section. The primary function of the redo log buffer is to store redo entries in memory until LGWR writes them to disk. In all but the least critical database applications, it is recommended to preserve the redo logs using the archivelog feature of the Oracle database. Archiving is often handled automatically by the ARCH process, which copies filled online redo logs on disk to the archive destination. Archiving is highly recommended; however, there are performance considerations the DBA should be aware of that can put the LGWR and ARCH processes in contention. If for some reason ARCH cannot archive a redo log, and LGWR fills all the online redo logs with redo information, operation on the database will stop until the DBA resolves the archiving issue.

Setting Appropriate Checkpoint Intervals

The issue of determining checkpoint intervals presents another interesting set of considerations for the DBA. During normal database operation, LGWR writes redo entries from the redo log buffer to disk whenever user processes commit their transactions. A checkpoint is a point in time when LGWR stops writing redo information in order to write the current log sequence number to the datafile headers and to the control files of the database, and to tell DBWR to write dirty buffers from the dirty buffer write queue to disk.
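A minimal sketch of the three pinning steps, assuming a hypothetical package SCOTT.LOOKUP_PKG and that the DBMS_SHARED_POOL package has been installed (it is created by the dbmspool.sql script):

```sql
-- Step 1: free space in the shared pool.
ALTER SYSTEM FLUSH SHARED_POOL;

-- Step 2: reference the object so it is loaded into the library cache.
-- (LOOKUP_PKG.INIT is a hypothetical call, used only for illustration.)
EXECUTE scott.lookup_pkg.init;

-- Step 3: pin it. 'P' flags a procedure or package,
-- 'C' a cursor, 'R' a trigger.
EXECUTE DBMS_SHARED_POOL.KEEP('SCOTT.LOOKUP_PKG', 'P');
```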
At the time a checkpoint occurs, performance of online applications may momentarily drop as LGWR stops writing redo log entries. The more frequent the checkpoints, the more often this performance hit will occur. However, the more frequent the checkpoints, the more current the datafiles, and the faster instance recovery will be in the event of a failure on the database. The number of checkpoints is decreased by setting LOG_CHECKPOINT_INTERVAL to a number larger than the size of the redo log file, so that checkpoints occur only at log switches, or by eliminating time-based checkpoints by setting LOG_CHECKPOINT_TIMEOUT to zero. Checkpoints can also be reduced in frequency by increasing the size of the redo log files, which lets each redo log accept more entries before reaching capacity and forcing a switch. Finally, the CKPT process can be enabled to handle writing log sequence information to the datafile headers and the control files in place of LGWR by setting the CHECKPOINT_PROCESS initialization parameter to TRUE.

Determining Contention for the Redo Log Buffer

If user processes write redo information to the redo log buffer faster than LGWR can copy the entries to disk, user processes may be temporarily unable to write records to the redo log buffer. If such waits occur too frequently, the space allocated to the redo log buffer can be increased. To determine whether user processes are waiting for space in the redo log buffer, the DBA can query the V$SYSSTAT performance view for the VALUE column where the NAME column is 'redo log space requests'. Ideally, this statistic should be stable and as close to zero as possible. If it is high or increasing, the DBA should increase the space allotted to the redo log buffer by changing the value of the LOG_BUFFER initialization parameter, which is expressed in bytes.
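The check for redo log buffer waits described above might look like this:

```sql
-- User processes waiting for space in the redo log buffer.
-- Ideally this value stays stable and close to zero.
SELECT name, value
FROM   v$sysstat
WHERE  name = 'redo log space requests';
```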
However, as with resizing the shared pool, care should be taken not to increase the size of the SGA beyond what fits in real memory.

Relieving Contention for the Redo Allocation Latch

To conclude this treatment of the redo log buffer, access to write redo log entries is controlled by two latches, the redo allocation latch and the redo copy latch. There is one redo allocation latch in the entire Oracle database, which ensures that entries are written to the online redo log sequentially. In heavy transaction-processing environments, there can be contention for the redo allocation latch. Approaches to solving that contention center on shortening the amount of time any process can hold the latch. There are two ways to do this. One is to reduce the size of the entry any process can write while holding the latch, accomplished by decreasing the value, expressed in bytes, of the LOG_SMALL_ENTRY_MAX_SIZE parameter. The other is to require processes to build their redo entries before requesting the redo allocation latch, accomplished by setting the LOG_ENTRY_PREBUILD_THRESHOLD initialization parameter to a value in bytes; any redo entry smaller than that threshold is built in its entirety before the latch is requested.

Tuning the Buffer Cache

The final area of SGA tuning is the buffer cache. This area of memory stores a number of recently used database blocks. The principle behind the buffer cache is that recently used blocks may be used again by the database, and if so, Oracle can speed the performance of queries requiring them by caching the blocks in memory.

The Buffer Cache Hit Ratio

To determine whether the size of the buffer cache is effective, the buffer cache hit ratio can be calculated using statistics gathered from the V$SYSSTAT dynamic performance view on the Oracle database.
The statistics to gather from this view are the values in the VALUE column where the NAME column equals 'db block gets', 'consistent gets', and 'physical reads'. The buffer cache hit ratio is then calculated as (1 - PHYSICAL READS / (DB BLOCK GETS + CONSISTENT GETS)) * 100.

Determining Whether to Resize the Buffer Cache

As stated earlier, there are situations where the buffer cache may need to be increased or decreased, depending on the amount of memory available or added to the system and the memory requirements of the other areas of the SGA. If real memory doesn't change and the size of another area of the SGA changes, the DBA should first consider altering the size of the buffer cache to compensate. Two performance views are used to estimate the impact of adding or removing buffers from the buffer cache: X$KCBRBH and X$KCBCBH. X$KCBRBH estimates the additional cache hits that would be gained by raising the number of buffers, and is enabled by setting the DB_BLOCK_LRU_EXTENDED_STATISTICS initialization parameter to TRUE; X$KCBCBH estimates the hits that would be lost by cutting buffers from the cache, and is enabled by setting DB_BLOCK_LRU_STATISTICS to TRUE. Collecting these statistics adds overhead, so the parameters should be returned to FALSE once the analysis is complete.

Using Table Caching

Oracle eliminates data blocks from this cache on the premise that the blocks used least recently can be eliminated first. One exception exists to prevent full table scans on large tables from eliminating all other blocks potentially in use by other processes: blocks loaded into the buffer cache as a result of full table scans are eliminated first. In some cases, however, small nonindexed tables may be used to store information, such as valid values, that is useful to many processes.
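The buffer cache hit ratio calculation described above might be sketched as:

```sql
-- Buffer cache hit ratio: (1 - physical reads / logical reads) * 100,
-- where logical reads = db block gets + consistent gets.
SELECT ROUND((1 - phy.value / (db.value + con.value)) * 100, 2) AS hit_ratio
FROM   v$sysstat phy, v$sysstat db, v$sysstat con
WHERE  phy.name = 'physical reads'
AND    db.name  = 'db block gets'
AND    con.name = 'consistent gets';
```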
To prevent the database from quickly eliminating the blocks of such small lookup tables, the DBA can identify tables whose blocks should be cached even when they are read via full table scan, by issuing an alter table tablename cache statement.

Monitoring the Buffer Cache

Finally, it is important for the DBA to remember to keep the SGA sized so that it always fits into real memory. If the SGA is sized out of real memory, the resulting paging between memory and disk will be extremely detrimental to the overall performance of the database.

Use of SYSTEM, RBS, and TEMP Tablespaces

The five types of tablespaces discussed are SYSTEM, RBS, DATA, INDEX, and TEMP. Database objects are meant to be placed into these tablespaces according to the following breakdown: rollback segments in RBS, tables in DATA, indexes in INDEX, temporary segments required by user processes in TEMP, and the data dictionary tables and initial rollback segment in SYSTEM. Since rollback segments and temporary segments tend to fragment, it is generally wise to keep them out of the tablespaces used by tables and indexes. Particular importance is placed on the SYSTEM tablespace. Since this tablespace contains critical objects such as the data dictionary and the initial rollback segment, it is unwise to place any other types of objects in it. Placing objects like tables and indexes in SYSTEM can cause a problem: if the SYSTEM tablespace should fill, the only way to add space to it is to drop and re-create it, and the only way to drop and re-create the SYSTEM tablespace is to drop and re-create the entire database. This act requires a full restore of all data and generally creates huge problems. Therefore, nothing other than the data dictionary and the initial rollback segment should be placed in the SYSTEM tablespace.

Configuring Rollback Segments

There are two types of rollback segments, public and private.
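Returning to table caching for a moment: for a small lookup table (the table name here is hypothetical), the caching behavior described above is enabled with:

```sql
-- Keep full-table-scan blocks of this small lookup table in the
-- buffer cache instead of aging them out immediately.
ALTER TABLE valid_status_codes CACHE;
```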
In databases that do not use the Parallel Server Option, public and private rollback segments behave the same. In databases that use the Parallel Server Option, public rollback segments are a pool of rollback segments that can be acquired by any instance in the parallel configuration, while private rollback segments are acquired only by the instance that names the rollback segment explicitly. The number of public rollback segments acquired at startup depends on a calculation involving two initialization parameters: TRANSACTIONS / TRANSACTIONS_PER_ROLLBACK_SEGMENT. This quotient is the number of rollback segments the Oracle instance will acquire at startup. The DBA can guarantee that certain private rollback segments are acquired as part of that number by naming them in the ROLLBACK_SEGMENTS initialization parameter. Part of rollback segment configuration involves choosing an optimal size for the segment, specified with the optimal storage clause as part of rollback segment creation. To determine that size, the DBA should find out as much as possible about the size of the transactions that will take place on the database. Another important feature to remember about rollback segment creation is that all extents of the rollback segment will be the same size. This design choice eliminates the possibility that a long-running transaction forces a rollback segment to fill the associated tablespace with its extents, making it impossible for any other object in that tablespace to obtain an extent. Oracle enforces this design by removing the pctincrease storage clause from the syntax of the create rollback segment statement.

Distributing Files to Reduce I/O Contention

File distribution to minimize contention is covered next. There are specific criteria for evaluating which Oracle resources can safely be placed together on the disks of the machine hosting Oracle.
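A sketch of rollback segment creation with equally sized extents and an optimal setting (the segment name, tablespace name, and sizes are all illustrative):

```sql
-- INITIAL and NEXT are equal, and pctincrease is not permitted,
-- so every extent of the rollback segment is the same size.
CREATE ROLLBACK SEGMENT rbs01
TABLESPACE rbs
STORAGE (INITIAL 100K NEXT 100K
         MINEXTENTS 10 MAXEXTENTS 100
         OPTIMAL 1M);

-- A newly created rollback segment must be brought online before use.
ALTER ROLLBACK SEGMENT rbs01 ONLINE;
```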
The most important part of this discussion is to recall what the different components are and how they might interact (and, more importantly, interfere) with one another. Some resources are best placed on separate disks to minimize I/O contention: DATA tablespaces and INDEX tablespaces, RBS tablespaces and redo logs, DATA tablespaces and the SYSTEM tablespace, DATA tablespaces and RBS tablespaces, and DATA tablespaces and TEMP tablespaces. Some acceptable combinations of resources on the same disk are redo logs and control files, all RBS tablespaces together, and others.

Using Disk Striping

For additional reduction of I/O contention in the DATA tablespaces, the option of table striping can be explored. Table striping is the practice of placing different extents of a large table in datafiles on separate disks. This method has excellent benefits for SQL queries running with the parallel query option when searching on nonindexed columns, which results in full table scans. Parallel query makes better use of the multiple CPUs and disk controllers that are available with disk striping.

Tuning Rollback Segments

Tuning rollback segments begins with identifying how to detect contention for rollback segments, that is, contention in memory for buffers containing rollback segment data.

Using V$ Views to Monitor Rollback Segment Performance

The V$WAITSTAT dynamic performance view is used to determine whether this contention exists. There are four classes of rollback segment blocks in use in the Oracle instance: the system undo header block, the system undo block, the undo header, and the undo block. The difference between header blocks and other rollback blocks is that header blocks contain rollback segment header information. The difference between system blocks and the other types is that system blocks belong to the rollback segment in the SYSTEM tablespace, while the other blocks belong to rollback segments in other tablespaces.
Whether there is contention for these blocks is determined by the wait ratio, derived as (WAITS / GETS) * 100, where waits is the sum of block waits for the classes listed above taken from the V$WAITSTAT performance view, and gets is the total number of data requests, represented by the sum of 'db block gets' and 'consistent gets' from the V$SYSSTAT performance view. The number of rollback segments needed is estimated with the rule of four: divide the total number of concurrent transactions by 4. If the number of concurrent transactions is under 32, round the quotient up to the nearest multiple of 4. Finally, the total number of rollback segments used in any database instance should not exceed 50.

Modifying Rollback Segment Configuration

Dynamic extension of rollback segments should be avoided. The current space allocation of any rollback segment can be determined by querying either the DBA_SEGMENTS view or the V$ROLLSTAT view. Preference is given to the V$ROLLSTAT view, as it serves as the basis for more user-friendly monitoring interfaces like Server Manager, although a join on V$ROLLNAME must be performed to pull the statistics for a rollback segment by name. In order to keep the rollback segment at the optimal size specified by the optimal clause at rollback segment creation, the instance performs shrinks on the rollback segment if too many extents are acquired for it. A high number of shrinks, as reflected in the column of the same name in the V$ROLLSTAT performance view, indicates that the optimal setting for the rollback segment is too low. Since allocating and deallocating extents for rollback segments creates additional processing overhead for the Oracle instance, the DBA should carefully monitor the rollback segment statistics and resize the optimal setting as necessary.
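The join between V$ROLLSTAT and V$ROLLNAME mentioned above, pulling the statistics discussed in this section by rollback segment name, might be sketched as:

```sql
-- Per-segment space and activity statistics, keyed by undo segment
-- number (USN); SHRINKS, WRAPS, and EXTENDS are discussed below.
SELECT n.name, s.extents, s.rssize, s.optsize,
       s.shrinks, s.wraps, s.extends
FROM   v$rollstat s, v$rollname n
WHERE  s.usn = n.usn;
```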
Shrinks occur on a rollback segment after a transaction commits that required the segment to grow more than one extent beyond the size specified by its optimal storage clause. A rollback segment can also be shrunk manually by the DBA with the alter rollback segment shrink statement. If no target size in bytes is specified, the rollback segment shrinks to the size specified by the optimal storage parameter. The WRAPS statistic that is also maintained in the database can be of some limited value. A wrap indicates that an active transaction could not fit its rollback entries into the current extent, and the rollback data had to wrap into a new extent. When a high WRAPS statistic appears in conjunction with a high value in the EXTENDS column of the V$ROLLSTAT dynamic performance view, there is ample evidence that the rollback segments are extending often (and later shrinking) and that there could be a performance problem from excessive SHRINKS and EXTENDS. However, a high number of wraps by itself simply indicates that transactions cannot fit entirely into one extent. A high number of WRAPS in conjunction with a low number of EXTENDS can indicate that the rollback segment is reusing currently allocated extents, a sign that the rollback segment is properly sized for the transactions assigned to it.

Allocating Rollback Segments to Transactions

A problem can occur in the database rollback segments when long-running queries attempt to access data that is volatile as a result of many smaller transactions changing the database, or when the long-running job itself is making many data changes.
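The manual shrink described above might be issued as follows (the segment name and target size are illustrative):

```sql
-- Shrink to an explicit size; with no TO clause, the segment
-- shrinks back to its OPTIMAL setting instead.
ALTER ROLLBACK SEGMENT rbs01 SHRINK TO 1M;
```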
If a query requires so many rollback entries to remain available for its read-consistent view that the rollback segment allocates as many extents as it can and the query still cannot finish, then ORA-01555, "snapshot too old (rollback segment too small)", will appear. Although this error can be corrected by adding more space to a rollback segment, a better solution is often to schedule the long-running job at off-peak times to lessen the burden on the rollback segments. Alternately, the problem can be addressed by assigning a large transaction to a rollback segment specifically sized to accommodate it, using the set transaction use rollback segment statement.

Determining Block Size

The first part of this discussion focused on the size of database blocks, which is specified by the DB_BLOCK_SIZE initialization parameter at database creation time. Typically, database block size is a multiple of operating system block size to minimize the number of physical I/O reads it takes for Oracle to retrieve data. DB_BLOCK_SIZE is stated in bytes and determines the size of each buffer in the database buffer cache of the SGA; the number of those buffers is set by DB_BLOCK_BUFFERS. Once the database is created, block size cannot be changed except by dropping and re-creating the database.

Setting PCTFREE and PCTUSED

In this portion, the topic of pctfree and pctused is discussed. These two storage options determine how Oracle inserts new rows into a database object. The pctfree option represents the portion of each data block that Oracle leaves free for growth of existing rows in the block as a result of updates to those rows. When the block fills to that point, Oracle takes the block off the freelist for the table. The pctused option is the percentage of used space in the data block below which usage must fall before Oracle considers placing the block back on the freelist. The range for each value is 0 to 99, but the sum of pctfree and pctused cannot exceed 100.
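Assigning a large transaction to a suitably sized rollback segment, as described above, looks like this (rbs_large is a hypothetical segment name); the statement must be the first in the transaction:

```sql
SET TRANSACTION USE ROLLBACK SEGMENT rbs_large;
-- ... run the large batch DML here ...
COMMIT;   -- the assignment lasts only until the transaction ends
```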
For performance reasons, pctfree and pctused should be set so that their total is close to, but less than, 100. The impact of setting pctfree is as follows. A high pctfree leaves more space free in the data block for each row to expand during updates; however, it will take more blocks to store the same number of rows than if pctfree were set low. A low pctfree maximizes block space usage by leaving less space free for existing rows to grow during updates, but there is an increased chance of row migration and chaining if the block becomes overcrowded and a row needs to expand. Setting pctused also has implications for the database. A high pctused means that Oracle will try to keep the data block filled to at least pctused at all times, which means additional processing overhead if the database experiences heavy data change activity. A low pctused means that Oracle will not add rows to a data block until much of the block space has been freed by data deletion; the block will have unused room for a while before being placed on a freelist, but once on the freelist, the block will be available for row inserts for a while as well.

Detecting and Resolving Row Migration

When a row grows too large to fit into its data block, Oracle finds another data block to place it into; this process is called row migration. If no block is available that can fit the entire row, Oracle breaks the row into several pieces and stores the pieces in different blocks; this process is called chaining. Both processes are detrimental to the performance of the database. Table rows that have been chained or migrated can be identified using the analyze command with the list chained rows option. The output from this command is placed in a special table called CHAINED_ROWS, which must be created by running the UTLCHAIN utility script before executing the analyze command.
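A sketch of those detection steps, assuming the utlchain.sql script (its path varies by platform) and a hypothetical EMPLOYEE table:

```sql
-- Create the CHAINED_ROWS output table.
@?/rdbms/admin/utlchain.sql

-- Record the ROWIDs of migrated and chained rows.
ANALYZE TABLE employee LIST CHAINED ROWS INTO chained_rows;

-- Inspect the results.
SELECT head_rowid FROM chained_rows WHERE table_name = 'EMPLOYEE';
```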
The DBA can then copy the chained rows into a temporary table, delete the rows from the original table, change the value for pctfree on the table, and insert the rows back from the temporary table. Alternately, the DBA can store the row data using EXPORT or a flat file, drop and re-create the table with a new pctfree setting, and repopulate the table using IMPORT or SQL*Loader.

Detecting and Resolving Freelist Contention

Freelists are lists that Oracle maintains of the blocks with space available for row insertion in a table. A freelist is experiencing contention if processes are contending for the free data blocks of the table in memory. To calculate the wait ratio for freelists, V$WAITSTAT and V$SYSSTAT are used in the formula (WAITS / GETS) * 100, where waits is V$WAITSTAT.COUNT for V$WAITSTAT.CLASS = 'free list', and gets is the sum of V$SYSSTAT.VALUE where V$SYSSTAT.NAME is either 'db block gets' or 'consistent gets'. Contention for freelists is resolved by dropping and re-creating the table with the freelists storage parameter set to the number of concurrent processes that insert new rows into the table. The table data can be stored and reloaded using EXPORT/IMPORT or SQL*Loader.

Monitoring and Detecting Lock Contention

Levels of Locking in Oracle

There are five different types of locks on the Oracle database. Shared row locks signal the holder's intention to change data in the locked rows while allowing other users to query the locked data at any time; these locks are acquired automatically by select for update statements. Exclusive row locks entitle the holder to exclusive update access to the locked row; no other user can change data in the locked row while the lock is held, although other users can still query it. These locks are acquired automatically by the update statement. Shared locks are table locks that allow any user on the database to view the table's data at any time, but restrict changes to the table while the lock is held.
Exclusive locks are table locks that give the holder exclusive access to change data in the table; no other user can change data in that table for as long as the lock is held, although other users can still query it. The final type of lock is the shared row exclusive lock. Table locks of these types can be acquired explicitly with the lock table statement; Oracle also provides a special package for managing user-defined locks, called DBMS_LOCK.

Identifying Possible Causes for Contention

Several possibilities for contention exist on the Oracle database. One arises from a process holding a higher-level lock than it needs to execute an update. Another is when the application acquires a lock in an area that otherwise behaves in a "select only" manner, and never relinquishes the lock it is given. Another possibility exists in the client/server architecture, when a user process drops off the database but still holds locks. It takes some time before Oracle realizes the connection was lost and allows the transaction to roll back; during that time, there can be contention as the user process reconnects and tries to execute the same operation it was executing when connectivity was lost.

Using Tools to Detect Lock Contention

There are utilities and views for discovering which processes are experiencing and causing waits. The UTLLOCKT utility script produces a tree diagram of the processes that hold locks and the processes waiting for those locks to be released. Another method for obtaining this information is to query the DBA_WAITERS view of the Oracle data dictionary.

Resolving Contention in an Emergency

The method DBAs use to resolve contention issues in an emergency is to kill one or both of the processes in contention. To do this, the DBA needs the session identifier and serial number (the SID and SERIAL# columns of V$SESSION, respectively) for the process or processes to be killed.
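The identification and kill steps just described might be sketched as follows (the SID and SERIAL# values in the kill statement are illustrative):

```sql
-- Find the SID and SERIAL# of the session holding the blocking lock.
SELECT s.sid, s.serial#
FROM   v$session s, dba_waiters w
WHERE  s.sid = w.holding_session;

-- Kill that session using the values returned above.
ALTER SYSTEM KILL SESSION '12,3456';
```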
This information comes from the V$SESSION dynamic performance view, where the session ID equals the holding session from DBA_WAITERS. The syntax for killing a session is the alter system kill session statement, passing the values for session ID and serial number.

Preventing Locking Problems

The best method for preventing a locking problem is to identify and solve the problem at the application level. Typically, the problem is caused by a process that acquires locks it never gives up, or that acquires locks with more scope than it really needs. The solution is to change the application to release the locks it acquires, or to use locks with the least "locking power" necessary to accomplish the work.

Identifying and Preventing Deadlocks

Deadlocks are a particularly serious locking problem. Oracle is designed to resolve certain deadlock situations automatically. The DBA can identify whether these situations are occurring by checking the alert log for the Oracle instance for the "deadlock detected while waiting for a resource" error. Deadlocks are prevented by making two recommendations at the application level: ensure that all processes acquire locks in the same order, and always use the lock with the least amount of "locking power" required to carry out the transaction.

Tuning Sort Operations

The operations that require sorts are group by, order by, select distinct, minus, intersect, union, min( ), max( ), count( ), the sort join internal Oracle operation, and the creation of indexes. Sorting should be done in memory where possible to ensure maximum performance. The frequency of sorts occurring in memory versus on disk can be assessed using the V$SYSSTAT performance view, where the value in the NAME column equals 'sorts (memory)' or 'sorts (disk)'.
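The comparison of in-memory and on-disk sorts described above might be sketched as:

```sql
-- Compare in-memory sorts with sorts that spilled to disk.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('sorts (memory)', 'sorts (disk)');
```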
To increase the proportion of sorts performed in memory, the size of the sort area can be increased, although the DBA should be careful to ensure that, once increased, the entire Oracle SGA still fits in real memory. The size of the sort area is determined by the value, in bytes, of the SORT_AREA_SIZE initialization parameter.

Allocating Temporary Disk Space for Sorts

When sorts require disk space, user processes should always use the special TEMP tablespace to allocate the temporary segments used for sorting. The DBA can ensure this happens by assigning every user a temporary tablespace at user creation. If no temporary tablespace is created or specified, user processes will allocate their temporary segments in the SYSTEM tablespace. Since temporary segments tend to fragment tablespaces, and since the SYSTEM tablespace is critical to the functioning of the Oracle database, it is ill-advised to use the SYSTEM tablespace for anything other than the data dictionary tables and the initial rollback segment of the database.

Using Direct Writes for Sorts

This option should only be used on systems that have extensive memory and disk resources available for improving sort performance. Direct writes allow Oracle to write temporary blocks for sorts directly to disk, bypassing the buffer cache of the SGA. To use this option, three initialization parameters must be set: SORT_DIRECT_WRITES (set to TRUE to enable), SORT_WRITE_BUFFERS (the number of buffers used for sort direct writes), and SORT_WRITE_BUFFER_SIZE (the size of each of those buffers). When using this setup, the DBA should ensure that there is enough additional memory on the system to accommodate the SGA size plus the value of SORT_WRITE_BUFFERS * SORT_WRITE_BUFFER_SIZE.
Additionally, the product SORT_WRITE_BUFFERS * SORT_WRITE_BUFFER_SIZE should be no more than 10 percent of the value specified for SORT_AREA_SIZE, or there may be little performance gain from using sort direct writes.

Optimizing Load on the Oracle Database

Configuring the SQL*Net Listener

The first area covered in this section is the configuration of the SQL*Net listener process. Initialization parameters for this process are contained in the listener.ora file. The name of the SQL*Net listener is usually LISTENER, but on machines running multiple listeners, corresponding to multiple networks connected to the Oracle database, the name of each listener must be unique. The listener has several parameters that can be configured, including PASSWORDS, SID_LIST, STARTUP_WAIT_TIME, CONNECT_TIMEOUT, TRACE_LEVEL, TRACE_DIRECTORY, TRACE_FILE, LOG_DIRECTORY, and LOG_FILE. Each parameter can be set using the LSNRCTL utility that handles listener configuration, and each parameter listed above should have the name of the listener appended to it in the listener.ora file.

Configuring Dispatchers

The next area to consider when optimizing load on the multithreaded server architecture is configuring the dispatchers on the Oracle server. Dispatchers act as go-betweens, mapping user processes coming into Oracle via the client/server architecture to the shared servers that obtain data from the database on behalf of those user processes. The number of dispatcher processes is determined by the MTS_DISPATCHERS initialization parameter, and its default value is five. Contention for dispatcher processes is detected with two statistics: the busy ratio and the wait ratio. The busy ratio is calculated from statistics in the V$DISPATCHER dynamic performance view, while the wait ratio is calculated from statistics in the V$QUEUE view. The busy ratio is the total busy time divided by (busy time plus idle time), times 100.
The wait ratio is calculated as the sum of values in the WAIT column divided by the sum of values in the TOTALQ column of the V$QUEUE view, times 100. If the busy ratio is over 50, or if the wait ratio is increasing steadily, the DBA can increase the number of dispatchers by raising the value set for MTS_DISPATCHERS, executing an alter system set mts_dispatchers command that specifies the protocol to which dispatchers are added and the number of dispatchers to add for that protocol. The number of dispatchers across all protocols may not exceed the total allowed, as set by the MTS_MAX_DISPATCHERS initialization parameter. If the DBA wishes to raise MTS_DISPATCHERS above the value of MTS_MAX_DISPATCHERS, the DBA must increase MTS_MAX_DISPATCHERS as well.

Configuring Shared Servers

Shared servers are the processes that obtain data from the Oracle database on behalf of user processes. The number of shared servers started with the instance is determined by the MTS_SERVERS initialization parameter, and the maximum allowed on the system is determined by the MTS_MAX_SERVERS parameter, which defaults to 20. It is generally best to set MTS_SERVERS low, because Oracle will automatically add shared servers if the average wait time for a shared server goes up rapidly or consistently. The DBA monitors shared server performance with statistics collected in the V$QUEUE performance view. The average wait time is calculated by dividing the value in the WAIT column by the value in the TOTALQ column where the value in the TYPE column equals 'COMMON'. If this statistic rises consistently and the maximum number of shared servers has been reached, the DBA can increase the value of MTS_MAX_SERVERS to alleviate high average wait times for shared servers.
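The average-wait calculation for the shared server request queue described above might be sketched as:

```sql
-- Average wait per queued request on the common (shared server) queue;
-- DECODE guards against division by zero on an idle system.
SELECT DECODE(totalq, 0, 0, ROUND(wait / totalq, 2)) AS avg_wait
FROM   v$queue
WHERE  type = 'COMMON';
```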