Download Tuning Oracle RAC

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Distributed operating system wikipedia , lookup

CAN bus wikipedia , lookup

Computer cluster wikipedia , lookup

Transcript
Tuning Oracle RAC
Guy Peleg
President
Maklee Engineering
[email protected]
Friday, May 12, 2017
Agenda
RAC overview & Performance Expectations
Performance Tips
SQL Tuning
Typical RAC configuration
App
Server1
App
Server2
Client
Public Network
Database
Instance 1
Private Network
(Interconnect)
Node 1
Database
Instance 2
Node 2
Storage Network
Local Storage
Local Storage
Shared
Database
Oracle RAC
Oracle RAC provides two main features:
Availability
Scalability
May operate in two modes:
All nodes are active (load distributed between nodes)
Active/Passive
RAC scaling/performance considerations are similar to
OpenVMS clustering scaling/performance considerations
Interconnect
Locks
Sharing
RAC Scaling – Maklee’s Golden Rules
Application that does not scale on a standalone node
– will not scale on RAC
Start with single instance tuning
shutdown all nodes measure scaling
test scaling by adding CPUs
Add one node at a time to measure scalability
Scalability Benchmark
2 nodes cluster
1.3 Ghz rx2600, running OpenVMS V8.3-1H1
Oracle 10gR2 RAC
Latest set of patches
Test database contains information about
50,000 customers
200,000 customer orders
200,000 ordered items
Scalability Benchmark
PL/SQL procedure to fetch data about 2000 random
customers
Read only test
All data in SGA
No I/O
CPU Bound
Scalability Benchmark
1400
1200
1 Job
1000
800
600
400
200
6 Parallel jobs one instance
3 Parallel jobs
instance 1
3 Parallel jobs
instance 2
0
Elapsed time (seconds per job) to complete the test
Less is better
RAC Proof Of Concept
MAKLEE Engineering recently performed a RAC proof of
concept installation at a large chain of department stores in
Switzerland.
Benchmarked a single Alpha GS1280 (production node) vs. a
RAC cluster running 2 Integrity servers rx6600.
The goals were:
Install RAC
Get hands on experience with RAC
Perform RAC scaling tests
Make a go/no go decision on implementing RAC in production
Hardware & Software Configuration
Oracle RAC Configuration:
2 nodes OpenVMS Cluster
Each node is rx6600 with 8 cores
OpenVMS V8.3-1H1
EVA8000 storage
Products installed:
Oracle CRS (Cluster Ready Services)
Oracle 10g R2
DBCA executed for configuring RAC enabled database
Database patches
27 Parallel Database Import Jobs
160
Standalone GS1280
140
120
rx6600
RAC/standalone
Itanium
100
80
60
40
20
0
Minutes to complete database import
less is better
Database Import
Itanium outperformed Alpha
Operating in RAC environment does not increase the
throughput of the import operation
Spreading the jobs across two nodes or running all
jobs on one node yields identical
performance/throughput
No performance degradation witnessed
Batch Processing Benchmark
45
54 jobs - Single
Alpha GS1280
40
35
54 jobs - Single
rx6600
30
54 jobs - Spread
across the RAC
25
20
15
10
5
0
Minutes to complete batch processing cycle
Less is better
Batch Processing Benchmark
Itanium outperformed Alpha
RAC allows scaling outside of the box
Second RAC node adds 40% more throughput
Another Example – European Bank
European Bank migrating from Alpha to Itanium
2 nodes AlphaServer ES47 -> 2 nodes rx7640
Migrating to Oracle 10gR2 RAC
Availability is main concern
Interactive users will be distributed between nodes
No plans to distributed batch load between nodes
Needed to verify that RAC does not degrade performance
Another Example – European Bank
Benchmarked various batch jobs – focusing on one
specific batch job.
Initial results did not favor Itanium.
Batch Processing Benchmark
35
Alpha ES47
30
25
rx7640 out of the
box
20
rx7640 after tuning
15
10
5
0
Minutes to complete selected batch job
Less is better
European Bank - Summary
Tuning is critical for achieving optimal performance
Don’t run “out of the box”.
66% improvement after (minimal) tuning
The specific benchmark is running 52% faster on
Itanium comparing to Alpha.
European Bank - Summary
All other batch jobs/applications witnessed similar
improvement.
RAC increases availability and does not degrade
performance.
RAC will go into production in few weeks
Performance Tips
CRS Base Priority
CRS is running in batch
Usually, runs in a dedicated batch queue
By default, base priority of a batch queue is 4
On a system with thousands of processes, CRS may need to compete (and
sometimes lose) for CPU resources
CRS should be given high priority
Set base priority of CRS queue to 12
RAC Cluster Interconnect
The performance of the cluster interconnect is critical to the performance
of the RAC.
Interconnect used for
Cluster management
Locks
Cache Fusion
Oracle requires (at least one) dedicated cluster interconnect
Gigabit Ethernet is highly recommended
Enable Jumbo Frames
Transfer rate of ~ 25MB per second (faster than some disks ;-)
Cluster interconnect Performance
Latency is CRITICAL for RAC performance
Measure the latency of the interconnect:
set numwidth 20
column "AVG CR BLOCK RECEIVE TIME (ms)" format 9999999.9
select
b1.inst_id,
b2.value "GCS CR BLOCKS RECEIVED",
b1.value "GCS CR BLOCK RECEIVE TIME",
((b1.value/b2.value) * 10) "AVG CR BLOCK RECEIVE TIME (ms)"
from gv$sysstat b1,
gv$sysstat b2
where b1.name='gc cr block receive time'
and b2.name='gc cr blocks received'
and b1.inst_id=b2.inst_id;
Cluster interconnect Performance
Latency should be lower than 15ms
OpenVMS achieved 0.5ms on
blades RAC (BL860)
V8.3-1H1
Gigabit Ethernet
Jumbo Frames enabled
Load distribution between instances
set pagesize 60 space 2 numwidth 8 linesize 132 verify off
feedback off
column service_name format a20 truncated heading 'Service'
column instance_name heading 'Instance' format a10
column service_time heading 'Service Time|mSec/Call' format
999999999
select service_name,
instance_name,
elapsedpercall service_time,
cpupercall cpu_time,
dbtimepercall db_time,
callspersec throughput
from gv$instance gvi,
gv$active_services gvas,
gv$servicemetric gvsm
where gvas.inst_id=gvsm.inst_id
and gvas.name_hash=gvsm.service_name_hash
and gvi.inst_id=gvsm.inst_id
and gvsm.group_id=10
order by
service_name,
gvi.inst_id;
Standalone Database Import
8
rx6600 before
tuning
7
6
5
rx6600 after tuning
4
3
2
1
0
Minutes to complete database import
less is better
Database import
Install imp.exe as resident image with shared address space
$ install add imp.exe/resident/share=addr
Increase default quotas for BEQ’s mailboxes
$ define/sys ORA_BEQ_MBXSIZ 64000
$ define/sys ORA_BEQ_MBXSBFQ 64000
Set DEFMBXBUFQUO to 64000
Set DEFMBXMXMSG to 64000
DBMS_STATS.GATHER_SCHEMA_STATS
100
90
80
70
60
50
40
30
20
10
0
rx6600 before
tuning
rx6600 after tuning
Minutes to gather database statistics (350GB database)
Less is better
DBMS_STATS.GATHER_SCHEMA_STATS
Calling gather_schema_stats results in a database server
process being created
The server process in not multithreaded
Typically consumes 100% of one CPU
Performance improvement achieved by affinitizing the server
process to one CPU and increasing QUANTUM to 20.
SORT
Analyze the efficiency of sort operations
Determine the number of optimal, one pass and multipass operations
SELECT optimal_count, round(optimal_count*100/total, 2)
optimal_perc,
onepass_count, round(onepass_count*100/total, 2)
onepass_perc,
multipass_count, round(multipass_count*100/total, 2)
multipass_perc
FROM
(SELECT decode(sum(total_executions), 0, 1,
sum(total_executions)) total,
sum(OPTIMAL_EXECUTIONS) optimal_count,
sum(ONEPASS_EXECUTIONS) onepass_count,
sum(MULTIPASSES_EXECUTIONS) multipass_count
FROM v$sql_workarea_histogram
WHERE low_optimal_size > 64*1024);
Sizing the SGA
Reserve memory for the SGA (SYSMAN)
Avoid automatic memory management in the SGA
whenever possible.
The following query will help properly size the SGA
select sga_size, sga_size_factor as size_factor,
estd_physical_reads as estimated_physical_reads
from v$sga_target_advice order by sga_size_factor;
Sizing the SGA
SQL> select sga_size, sga_size_factor as size_factor,
2 estd_physical_reads as estimated_physical_reads
3 from v$sga_target_advice order by sga_size_factor;
SGA_SIZE SIZE_FACTOR ESTIMATED_PHYSICAL_READS
---------- ----------- -----------------------4356
,75
44485808
5808
1
24659539
7260
1,25
24659539
8712
1,5
24659539
10164
1,75
24659539
SQL>
What’s wrong in this picture?
$ show memory
System Memory Resources on
1-APR-2008 15:32:35.62
Physical Memory Usage (bytes):
Main Memory
(GB)
Total
64.00
Free
58.27
In Use
5.69
Modified
0.02
Extended File Cache (Time of last reset: 31-MAR-2008 15:14:46.99)
Allocated (MBytes)
397.03
Maximum size (MBytes)
32768.00
Free (MBytes)
17.82
Minimum size (MBytes)
3.12
In use (MBytes)
379.20
Percentage Read I/Os
77%
Read hit rate
99%
Write hit rate
0%
Read I/O count
5368075
Write I/O count
1578011
Read hit count
5315683
Write hit count
0
Reads bypassing cache
79
Writes bypassing cache
241954
Files cached open
739
Files cached closed
2255
Vols in Full XFC mode
0
Vols in VIOC Compatible mode
52
Vols in No Caching mode
0
Vols in Perm. No Caching mode
0
....
Of the physical memory in use, 8.52 GB are permanently allocated to OpenVMS.
$
SQL Tuning
The next step in improving performance
SQL Tuning !
With previous “Alpha Vs. Itanium” benchmarks we
had to play it fare
Not a single SQL statement was changed.
SQL tuning may improve performance by magnitudes
SQL Tuning
All the tools that are required for SQL tuning are shipping with
the database:
Automatic Workload Repository (AWR)
Endless amount of performance related information
Enhanced version of statpak
Active Session History (ASH)
Automatic Database Diagnostic Monitor (ADDM)
SQL Access Advisor
SQL Tuning Advisor
Statspack analyzer (not part of the DB but available for free)
The power of SQL tuning
AWR was used to analyze the “scalability benchmark”
97% of the time was spent executing single SQL statement
After SQL tuning – elapsed time of the benchmark was reduced
from 411 seconds to 3.18 seconds !
130 times
450
400
350
300
250
200
150
100
50
0
Untuned
version
Tuned version
faster!!!!
The power of SQL tuning
“Real life” example
rx6600, Oracle 10g, DWH DB
Single SQL statement required 140 minutes to complete
By biasing the optimizer, elapsed time reduced to 10 minutes
140
120
100
80
60
40
20
0
Untuned
version
Tuned version
Questions?
See us at www.maklee.com for:
• Performance improvements
• Oracle Tuning
• Platform Migration
• Custom Engineering solutions
• Custom Training