Download Understanding Common Oracle Wait Event - DOUG

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Entity–attribute–value model wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Oracle Database wikipedia , lookup

Relational model wikipedia , lookup

Clusterpoint wikipedia , lookup

Database model wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Open Database Connectivity wikipedia , lookup

SQL wikipedia , lookup

PL/SQL wikipedia , lookup

Transcript
Understanding
Common Oracle Wait
Events
Kirtikumar Deshpande
Dallas Oracle User Group
September 15, 2005
1
About Me

Senior Oracle DBA

Verizon Information Services

Phone Directories Publication
2
About OWI Book:
“…Where this book excels is bridging the gap
between the perfect measurement and
implementable solutions by explaining why's
and how's of the problem. I was amazed how
the authors put together rational explanations
of common wait events like latch free,
bolstered by the elaboration of internals like
hash buckets, cache buffer chains and how to
rectify those - it all seems so simple when it
comes out in the book. Whether you are a
veteran DBA who have seen all the battles
since the Civil War or a rookie just starting
out, this book is for you, a vital weapon in
your arsenal, especially the scripts for
identifying trouble spots. If I'm allowed to
keep only one book on Oracle - this will be it.”
- Arup Nanda
3
About OWI Book:
“I received this book on Tuesday and I
literally could not put it down. I
consumed it like a great thriller. It
contains a great deal of information that
can be found no where else in print.
Performance monitoring and tuning with
the Oracle Wait Interface is still new to
many Oracle DBAs, even seasoned ones,
precisely because the details of how to
gather and interpret the information have
been difficult to come by until now...”
– John Smiley
4
About OWI Book:
“The book is simply spectacular, both for the quality
of its writing as well as the depth of the material.
Practical? Indeed, indispensible!
The three authors, Richmond Shee, Kirti Deshpande,
and K Gopalakrishnan, have done a wonderful job in
organizing a enormous subject area into manageable
chunks. They have also managed to render
potentially bone-dry source material into very
readable text, interspersed heavily with code
examples and output, sidebars, and analogies. This is
a good read as well as an authoritative reference.
…. The combined efforts of the three authors and the
four technical editors blows my mind.
…. All I can say is - get it! ”
- Tim Gorman
5
Acknowledgement

Special thanks to Richmond Shee for
allowing me to use the contents of his
presentation at IOUG Live! 2005
6
Agenda
•
•
OWI Monitoring and Data capture
Handling Common Oracle Wait Events –
Going Beyond P1, P2, and P3
Intermediate level:
OWI Monitoring and Data Collection
Novice level:
Separating Symptoms from Problems
Beginner level:
Event Attributes (P1, P2, P3),
Event Classification
OWI Views
New Convert:
Paradigm Shift
7
OBJECTIVE and SCOPE
Take Home
Information you can use to discover the root cause of
performance problems and answer the 64-thousand dollar
questions:
Why did the job run so slowly?
Why did the job run so quickly?
Scope
OWI Monitoring: Oracle7 to Oracle9i Database
Handling OWI: Oracle7 to Oracle Database 10g
8
First agenda item…
OWI Monitoring and Data Capture
9
OWI Monitoring and Data Capture
Q1) Why is historical performance data important?
Q2) What is the best source of performance data?
– V$SYSTEM_EVENT?
– V$SESSION_EVENT?
– V$SESSION_WAIT?
Q3) What is a good data capture method and sampling frequency?
– Trace event 10046?
– Statspack?
10
OWI Monitoring and Data Capture
The importance of historical
performance data…
Blame the
database
Unaware
Job runs
slow
Ask DBA
what's wrong
with the DB
Aware
Blame
others
• Users expect their DBAs to be omniscient
• DBAs are expected to be aware of performance issues 24x7
• You need a history of all foreground processes ran in the instance
11
OWI Monitoring and Data Capture
Determine the best source of data…
V$SYSTEM_EVENT
V$SESSION_EVENT
Pros:
Pros: session-level granularity
Cons: system-level data
Cons: session-level granularity
V$SESSION_WAIT (X$KSUSECST)
Pros: Fine-grain data
Cons:
– Changes quickly, High volume of data
– Data requires translation
12
OWI Monitoring and Data Capture
Determine the best data capture method and sampling frequency…
Requirement:
• A performance data collector that is capable of monitoring all
foreground processes on a 24x7 basis.
Desired features:
• Wait-based philosophy
• Low overhead
• Always-on
• Repositories (wait events, runtime statistics, SQL statements,
and SQL plans)
13
OWI Monitoring and Data Capture
Consider the trace event 10046…
• Oracle’s most comprehensive trace facility.
• It captures wait events, SQL statements, bind variables.
• Fine-grain data is best for troubleshooting, but requires a lot of disk space.
• Disk space and overhead limitations prevent instance-wide monitoring
• Trace file
…is not user friendly:
WAIT #12: nam='db file scattered read' ela= 0 p1=106 p2=60227 p3=8
…does not have cross referencing:
WAIT #1: nam='enqueue' ela= 3007483 p1=1415053318 p2=393259 p3=149
…can have bugs:
WAIT #0: nam='db file parallel write' ela= 2 p1=-144 p2=1 p3=0
• It may add significant overhead to the RDBMS and further degrade the performance
of an already slow running process.
• Documentation for interpreting trace file is seldom available
14
OWI Monitoring and Data Capture
Summary - Trace event 10046
Criteria
10046
Trace
Wait-based Methodology
Yes
Low overhead
No
Monitor every process 24x7
No
Repository: Wait Event
No
Repository: Process Runtime
No
Repository: SQL Statement
No
Repository: SQL Plan
No
Granularity of Performance
data
Fine-Grain
15
OWI Monitoring and Data Capture
Consider the database logoff trigger…
• Excellent for session-level summary.
• Great for benchmarking.
• Instance-wide monitoring capability.
• Trigger overhead depends mainly on the code.
• Some PL/SQL coding is necessary.
• Disk space requirement is generally low – depends on the logoff rate.
• Only available in Oracle8i and later versions.
• Not suitable for root cause analysis which requires fine-grain data.
16
OWI Monitoring and Data Capture
An application of the database logoff trigger…
CATEGORY
SUBCATEGORY WAIT_EVENT
VALUE
PERCENT
---------- ----------- ----------------------------- ---------- ---------CPU
DISK I/O
LATENCY
OTHER
Fetch, Execute, Lookups, etc
41089
34.37
PARSE
parse time cpu
14139
11.83
RECURSIVE
recursive cpu usage
552
.46
DIRECT I/O
direct path read
0
0
Direct path write
0
0
854
.71
FULL SCANS
db file scattered read
NORMAL I/O
db file sequential read
3645
3.05
COMMITS
log file sync
1469
1.23
FILE OPS
file open
1
0
LATCH
latch free
57470
48.07
LOG FILE
log file switch completion
6
.01
NETWORK
SQL*Net message to client
179
.15
2
0
36
.03
SQL*Net more data from client
SQL*Net more data to client
MISC
OTHER
buffer busy waits
107
.09
MISC
library cache pin
2
0
17
OWI Monitoring and Data Capture
Summary – Database logoff trigger
Criteria
10046
Trace
Database
Logoff Trigger
Wait-based Methodology
Yes
Yes
Low overhead
No
Yes **
Monitor every process 24x7
No
Yes
Repository: Wait Event
No
Yes
Repository: Process Runtime
No
Yes
Repository: SQL Statement
No
No
Repository: SQL Plan
No
No
Granularity of Performance
data
Fine-Grain
Coarse
18
OWI Monitoring and Data Capture
Consider Statspack…
• Report has a lot of information that allows you to examine performance
from several perspectives.
• Instance-level snapshots offer coarse-grain information that roughly
indicates there is a problem but not specifically where the problem is
- No different than querying v$system_event, v$sysstat, v$latch, etc.
• Session-level snapshots? How are you going to automate it? Even if
session-level snapshot automation is not an issue, the data is still too
coarse.
- No different than querying v$session_event and v$sesstat.
• Difficulty in determining the best sampling frequency.
19
OWI Monitoring and Data Capture
Summary – Statspack
Criteria
10046
Trace
Database
Logoff
Trigger
Statspack
Wait-based Methodology
Yes
Yes
Yes
Low overhead
No
Yes **
Yes
Monitor every process 24x7
No
Yes
No
Repository: Wait Event
No
Yes
Yes
Repository: Process Runtime
No
Yes
Yes
Repository: SQL Statement
No
No
Yes
Repository: SQL Plan
No
No
Yes
Granularity of Performance
data
Fine-Grain Coarse
Coarse
20
OWI Monitoring and Data Capture
Problem: There is no free suitable tool available.
Prior to Oracle Database 10g, you have to develop your own tool or purchase very
expensive 3rd party tools.
Statspack
Wait-based Methodology
Yes
Yes
Yes
Low overhead
No
Yes **
Yes
Monitor every process 24x7
No
Yes
No
Repository: Wait Event
No
Yes
Repository: Process Runtime
No
Yes
Repository: SQL Statement
No
No
Repository: SQL Plan
No
No
Granularity of Performance
data
Fine-Grain
Coarse
Yes
Yes
Yes
Yes
Too Coarse
Database
Logoff Trigger
Too Coarse
Criteria
10046
Trace
Coarse
21
OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Three major areas to consider:
• Sampling frequency
• Repository
• Events to monitor
22
OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Data Source:
• V$SESSION_WAIT (X$KSUSECST)
Sampling frequency:
• Affects the quantity and granularity of data
• Depends on data capture method
• Unix Shell script
• PL/SQL procedure
• Unix Cron
• SNP background process
23
OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Repositories: Minimum two repositories (wait events & SQL code)
Wait Event
is associated with
SQL
Statement
is associated with
SQL Plan
• SQL statements help set the context and get you closer to the problem.
Event: Buffer busy waits
P1 & P2 = FOOBAR table P3 = 220
• Also helps developers to locate the right module.
24
OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
Events to monitor:
Events to ignore:
db file sequential read
KXFX: Execution Message Dequeue – Slave
PX Deq: Execution Msg
KXFQ: kxfqdeq - normal deqeue
PX Deq: Table Q Normal
Wait for credit - send blocked
PX Deq Credit: send blkd
Wait for credit - need buffer to send
PX Deq Credit: need buffer
Wait for credit - free buffer
PX Deq Credit: free buffer
parallel query dequeue wait
PX Deque wait
Parallel Query Idle Wait – Slaves
PX Idle Wait
dispatcher timer
virtual circuit status
slave wait
pipe get
rdbms ipc message
rdbms ipc reply
pmon timer
smon timer
WMON goes to sleep
client message
SQL*Net message from client (* debatable)
Null event (* debatable)
db file scattered read
latch free
direct path read
direct path write
Enqueue
library cache pin
buffer busy waits
free buffer waits
PL/SQL lock timer
25
OWI Monitoring and Data Capture
BYOT: Build Your Own Tool (using PL/SQL to capture data)
• 24x7 monitoring.
• Wait event history.
–Immediate answer to why a certain process runs like molasses.
–Proactive performance management.
• SQL statement and plan repositories.
• Jobs elapsed time can be determined from the sampling intervals.
• Low disk space requirement.
• Extensive PL/SQL coding.
• Overhead depends on the quality of code.
• Not suitable for short-running jobs.
26
OWI Monitoring and Data Capture
Summary – PL/SQL procedure
Criteria
10046
Trace
Database
Logoff Trigger
Statspack
PL/SQL
Procedure
Wait-based Methodology
Yes
Yes
Yes
Yes
Low overhead
No
Yes **
Yes
Yes **
Monitor every process 24x7
No
Yes
No
Yes
Repository: Wait Event
No
Yes
Yes
Yes
Repository: Process Runtime
No
Yes
Yes
No
Repository: SQL Statement
No
No
Yes
Yes
Repository: SQL Plan
No
No
Yes
Yes
Granularity of Performance
data
Fine-Grain
Coarse
Coarse
Near finegrain
27
OWI Monitoring and Data Capture
Chapter 4 contains a
detailed discussion of
OWI monitoring and data
capture
28
Second agenda item…
Handling Common Oracle Wait Events
• Going beyond P1, P2, and P3
29
Handling Wait Events
db file sequential read
db file scattered read
At what point do these wait events become a problem?
What are they a symptom of?
a)
b)
c)
d)
e)
Low cache hit ratio
Slow I/O subsystem
Physical I/O calls
Small block size
Small buffer cache
Handling these wait events requires
you to know:1.
The amount of time the events
are costing the process.
2.
The SQL statement that is
associated with the events.
Solution: SQL tuning
30
Handling Wait Events
Latch Free
Latch Free contention is a symptom of?
a)
Low SPIN_COUNT.
b)
Inefficient SQL statements.
c)
Concurrency coupled with high demands for resources.
d)
Insufficient number of latches.
e)
Insufficient or slow CPU.
Handling the latch free contention requires you to know:1.
The type of latch sessions are competing for (28 individual latch
wait events in Oracle10g Release 1).
2.
The amount of time a session spent waiting on latches.
3.
The SQL statement that is associated with the event.
31
Handling Wait Events
Latch Free: Shared Pool & Library Cache
Contention for the Shared Pool & Library Cache latch is a symptom of?
a) Hard parses – literal SQL statements.
b) Soft parses.
c) Oversized shared pool.
d) High version count.
e) Bad application
Solution: If not (c), the real solution is correcting Application
Workarounds:

Set CURSOR_SHARING = FORCE

Set SESSION_CACHED_CURSORS
32
Handling Wait Events – Latch Free: Cache Buffers Chains
A Working Set
LRU
LRUW
Hash Latch
Hash Bucket
Buffer Header
Hash Chain
Buffers Memory
33
Handling Wait Events
Latch Free: Cache Buffers Chains
Contention for the CBC latch is symptomatic of?
a)
Inefficient SQL statement.
b)
Hot blocks.
c)
Long hash chains.
d)
Insufficient number of latches.
Handling the CBC latch contention requires you to know:
1.
If the contention is widespread or localized to a particular latch.
2.
The SQL statements that participate in the competition.
34
Handling Wait Events
Latch Free: Cache Buffers Chains
Solutions:

Tune the application and SQL statements.

Reduce the level of concurrency.
Workarounds:

Spread the hot blocks across multiple CBC latches.

Consider increasing _SPIN_COUNT (Oracle9i and above, use
_LATCH_CLASS and _LATCH_CLASSES).

Consider increasing _DB_BLOCK_HASH_BUCKETS.

Consider increasing _DB_BLOCK_HASH_LATCHES.
35
Handling Wait Events
Buffer Busy Waits
BBW contention is a symptom of?
a)
Read/read, read/write, or write/write contention.
b)
Corrupted buffer pin.
c)
Insufficient INITRANS.
d)
Large block size.
Handling the BBW contention requires you to know:1.
The amount of time a session spent waiting on the event.
2.
The reason code that represents why a process fails to get a buffer pin.
3.
The class of block that the buffer busy waits event is for.
4.
The SQL statements that are associated with the event.
5.
The segment that the buffer belongs to.
36
Handling Wait Events
BBW: Solutions depend on the class of block and reason code:
BBW contention for data block class (class #1), reason code 130

Reduce the level of concurrency or change the way the work is
partitioned between the parallel threads.

Optimize the SQL statement to reduce the number of physical and
logical reads.

Increase the number of FREELISTS and FREELIST GROUPS.
BBW contention for data block class (class #1), reason code 220

Reduce the level of concurrency or change the partitioning method.

Reduce the number of rows in the block.

Rebuild the object in another tablespace with a smaller block size
(Oracle9i and above).
37
Handling Wait Events
BBW : Solutions depend on the class of block and reason code:
BBW contention for data segment header (class #4)

Increase the number of FREELISTS and FREELIST GROUPS of the
identified object.

Ensure the gap between PCTFREE and PCTUSED is not too small.

Ensure the next extent size is not too small.
BBW contention for undo segment header (class #17)

**Applies to rollback segment, not the system-managed undo.

Create additional rollback segments.

Ensure the next extent size is not too small.
BBW contention for undo blocks (class #18)

Application tuning.
38
Handling Wait Events
Free Buffer Waits
Free Buffer Waits wait is symptomatic of?
a)
Small buffer cache.
b)
Insufficient number of DBWR processes.
c)
Inefficient SQL statement.
d)
Slow I/O subsystem.
e)
Delayed block cleanout.
Handling the Free Buffer Waits event requires you to know:1.
The amount of time a session spent waiting on the event.
2.
The SQL statements that are associated with the event.
3.
The number of DBWR processes.
4.
The I/O operation and database storage system.
39
Handling Wait Events
Free Buffer Waits
Solutions:

Optimize the SQL statements.

Increase the number of DBWR processes.

Use appropriate I/O operation (async or sync).

Lower the FAST_START_MTTR_TARGET value.

Reduce the buffer cache size.

Increase the buffer cache size.

Pre-scan the table after each load.
40
Handling Wait Events
Log File Sync
Log File Sync wait is symptomatic of?
a)
Oversized log buffer.
b)
High commit frequency.
c)
Bad application.
d)
Slow LGWR process.
Handling the Log File Sync event requires you to know:1.
The amount of time a session spent waiting on the event.
2.
The type of job (batch or OLTP) that is associated with the event.
Solution:

Reduce the commit frequency.
Workarounds:

Reduce the log buffer size or lower the _LOG_IO_SIZE.

Increase LGWR I/O throughput.
41
Handling Wait Events
Enqueue
Enqueue contention is symptomatic of?
a)
Concurrent access to the DBMS_AQ package.
b)
Concurrent transactions with incompatible lock requests for a
database resource.
c)
Concurrent transactions with incompatible lock requests for a
latch.
d)
Poor application design.
Handling the Enqueue contention requires you to know:1.
The type and mode of enqueue the sessions are competing for
(All enqueues have independent wait event names in Oracle
Database 10g).
2.
The amount of time a session spent waiting on enqueues.
3.
The SQL statement that is associated with the event.
42
Handling Wait Events
TX enqueue in mode 6 (Exclusive)
Contention for the TX enqueue in mode 6 is for row-level locks.
In Oracle Database 10g, this is “enq: TX – row lock contention”.
Solutions:

Commit or rollback the transaction holding the lock.

Fix the application so that sessions don’t go after the same rows.
Workaround:

None
43
Handling Wait Events
TX enqueue in mode 4 (Share)
Contention for the TX enqueue in mode 4 can be due to:

ITL shortage
- In Oracle Database 10g: “enq: TX – allocate ITL entry”)

Unique key enforcement

Bitmap index entry
Solution depends on the object of contention:

Increase the number of INITRANS.

Prevent multiple sessions from inserting the same key
value into a table.

Don’t use bitmap indexes.
44
Handling Wait Events
TM enqueue in mode 3,4,5 (Row-X, Share, Share Row_X)
Contention for the TM enqueue in mode 3,4,5 is normally due to
non-indexed foreign key columns.
Solution:

Index the foreign key columns of the object identified by
the TM enqueue.
45
Handling Wait Events
Chapters 5, 6, and 7 contain
a detailed discussion of how
to handle common Oracle
wait events.
46
Handling Wait Events
Do you think you can use
the information presented
in this session to identify
performance bottlenecks?
47
Understanding
Common Oracle Wait
Events
Q&A
[email protected]
48