Oracle World 2004 Failover Presentation

OpenVMS Solutions Center Lab
Project - Spring 2004:
Oracle 9i RAC DT/HA in a distributed OpenVMS Environment
Phase I – Failover
RAC DT/HA – Goals – Phase I
• First: Demonstrate that Oracle 9i RAC continues to run during simulated network failure using LAN Failover and failSAFE IP configurations.
• Second: Measure the latency effect of failover when RAC instances are connected over a long distance (100 km).
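As a rough feel for the second goal, the propagation delay that 100 km of fibre adds to the interconnect can be estimated from the speed of light in glass. This is a back-of-the-envelope sketch, not a measurement from the lab: the refractive index is an assumed typical value for single-mode fibre, and switch and protocol overheads are ignored.

```python
# Rough propagation-delay estimate for a fibre link (illustrative only).
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum, km/s
FIBRE_REFRACTIVE_INDEX = 1.468      # ASSUMPTION: typical single-mode fibre value


def one_way_delay_ms(distance_km: float,
                     refractive_index: float = FIBRE_REFRACTIVE_INDEX) -> float:
    """Time for light to traverse the fibre once, in milliseconds."""
    speed_in_fibre_km_s = SPEED_OF_LIGHT_KM_S / refractive_index
    return distance_km / speed_in_fibre_km_s * 1000.0


def round_trip_delay_ms(distance_km: float,
                        refractive_index: float = FIBRE_REFRACTIVE_INDEX) -> float:
    """Request plus reply: twice the one-way delay."""
    return 2.0 * one_way_delay_ms(distance_km, refractive_index)


if __name__ == "__main__":
    print(f"one way:    {one_way_delay_ms(100):.2f} ms")
    print(f"round trip: {round_trip_delay_ms(100):.2f} ms")
```

Under these assumptions each interconnect round trip over 100 km costs roughly one millisecond before any equipment latency is added.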
RAC DT/HA – What is Failover?
• Oracle RAC failover: The ability to resume work on an alternate instance upon instance failure
• Oracle TAF (Transparent Application Failover): Runtime failover which enables client applications to automatically reconnect to the database if the connection fails
• LAN Failover: Hardware failover from a failed network interface card (NIC) to another NIC configured as part of the LAN failover set
• failSAFE IP: Address failover to alternate interfaces
RAC DT/HA – Hardware Config
• 2 4-CPU GS160, with a shared cluster system disk and a shared Oracle install disk on an Enterprise Storage Array connected via Fibre SAN A Switch
• DE602-AA (EIA) NICs, using twisted pair on 100 Mbit LAN, Extreme Summit4 Switch
• 5 DEGPA-SA, 1 DEGXA-SA (EWA-D) NICs, 1 Gbit fiber on 1 Gbit LAN, Digital Networks DNSwitch 800
• 100 km cable, Gbit SCS, Extreme Summit 7i Switch
RAC DT/HA – Server Config
• OpenVMS 7.3-2, TCPIP 5.4
• Oracle Server 9.2.0.4, with Oracle patch for bug fix 3026720: Excessive CPU and BUFIO for LMD0 and SMON processes when >2 CPUs
• Running 2 RAC instances, in a 2-node cluster
• Requires the INIT<SID>.ORA parameter CLUSTER_INTERCONNECTS to specify an alternate network interface for RAC communication
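To make the last bullet concrete, the sketch below builds the parameter line that each instance's INIT<SID>.ORA would carry. It assumes the colon-separated address list form of CLUSTER_INTERCONNECTS; the quoting of the value, the SIDs RAC1 and RAC2, and the helper name are illustrative assumptions, while the 10.3.3.x addresses are the LLA0 addresses used in the LAN Failover test.

```python
def interconnect_line(addresses):
    """Return one init.ora line naming the interconnect address(es).

    ASSUMPTION: multiple addresses are joined with ':' and the value is
    quoted; check the Oracle reference for the release in use.
    """
    if not addresses:
        raise ValueError("at least one interconnect address is required")
    return 'CLUSTER_INTERCONNECTS = "' + ":".join(addresses) + '"'


# Hypothetical SIDs; addresses taken from the LLA0 devices on each node.
plan = {"RAC1": ["10.3.3.1"], "RAC2": ["10.3.3.2"]}

if __name__ == "__main__":
    for sid, addrs in plan.items():
        print(f"INIT{sid}.ORA: {interconnect_line(addrs)}")
```

Each instance names its own local address, which is why the parameter lives in the per-instance file rather than in a shared one.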
RAC DT/HA – Client Config
• 9.2 SQLNet Client, on PC running Windows 2000
• Benchmark/load generating software:
  • Swingbench 2.1f: an ‘unofficial’, Java-based client load generating tool from Oracle, which allows a ‘load’ to be generated and the transactions/response times to be charted
  • Configured to connect 100 clients, load balanced between the 2 instances, and run 50,000 ‘typical’ Order Entry transactions
RAC DT/HA – Test Plan
• Restore from disk backup before each test run to ensure the same starting point
• Ensure RAC instances are communicating over the specified network interface
• Run 3 iterations of the same benchmark load while collecting data
  • Run benchmark load, no failures
  • Run benchmark load, fail instance
  • Run benchmark load, fail network connection between instances
RAC DT/HA – Data collection
• T4 running on both nodes, 10 sec sampling interval
• Saved Swingbench data results after each run
• Executed and ‘saved’ output of VMS commands during network failures to see status of network devices and Oracle processes
$ MC LANCP SHOW DEVICE/CHARACTERISTICS LLA0
$ TCPIP SHOW INTERFACES/FULL
$ PIPE SHO SYS|SEA TT: ORA_CPU
Tabular Timeline Tracking Tool – T4
• Created by OpenVMS Sustaining Engineers to help diagnose OS functionality. Uses OpenVMS Monitor data, stored in comma-separated value (.csv) file format, which can then be used by a variety of applications (spreadsheets, TLViz, etc.)
• Download from the web. Shipped with OpenVMS 7.3-2, in the SYS$ETC directory
  http://h71000.www7.hp.com/openvms/products/t4/index.html
• Users are able to queue data collection and configure data collection frequency
• Helpful in establishing a baseline performance footprint, which can then be used in before-and-after comparisons of system changes
• T4 ‘hooks’ for Oracle and Rdb Server being created
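Because T4 output is plain CSV, a baseline footprint can be summarised with a few lines of script. The sketch below is illustrative only: it assumes a simplified file with a single header row, the column name is taken from the chart legends in this deck, and the sample values are made up rather than taken from the lab runs. Real T4 files may carry extra header rows that need skipping first.

```python
import csv
import io
from statistics import mean


def column_stats(csv_text, column):
    """Summarise one numeric column of a T4-style CSV extract."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader if row.get(column)]
    return {"samples": len(values), "mean": mean(values), "peak": max(values)}


# Made-up sample at the 10-second interval used in the tests (not lab data).
SAMPLE = """Sample Time,[NET.EIA0:]Bytes Recv/Sec
26-Mar-2004 16:20:00,1000000
26-Mar-2004 16:20:10,1200000
26-Mar-2004 16:20:20,1100000
"""

if __name__ == "__main__":
    print(column_stats(SAMPLE, "[NET.EIA0:]Bytes Recv/Sec"))
```

Running the same summary on a before run and an after run gives the kind of comparison the baseline bullet describes.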
RAC DT/HA – EIA Network
[Diagram: GS160 nodes QBB0 (EIA0 161.114.69.7) and QBB3 (EIA0 161.114.69.8), with the Oracle RAC network connection using the EIA device over UTP Ethernet on the 100 Mbit LAN (Extreme Summit 4 switch), shared with PC Swingbench clients 1 to 100. The EVA, holding the common system disk, shared Oracle 9i, and database, is attached via the Fibre SAN A switch.]
RAC DT/HA – T4 data - EIA
[Chart: EIA0 - Baseline. [NET.EIA0:]Bytes Recv/Sec, node QBB0, 26-Mar-2004 16:20 to 16:45; y-axis 100,000 to 1,400,000.]
RAC DT/HA - LAN Failover Network
[Diagram: GS160 nodes QBB0 (EIA0 161.114.69.7, LLA0 10.3.3.1) and QBB3 (EIA0 161.114.69.8, LLA0 10.3.3.2), with the Oracle RAC connection using the LLA0 device for LAN Failover. On each node, EWA0 and EWB0 connect to the Gbit LAN (Digital Networks DNswitch 800). EIA0 connects over UTP Ethernet to the 100 Mbit LAN (Extreme Summit 4 switch), shared with PC Swingbench clients 1 to 100. The EVA, holding the common system disk, shared Oracle 9i, and database, is attached via the Fibre SAN A switch.]
RAC DT/HA – LAN Failover DCL
$ MCR LANCP SHOW DEVICE/CHAR LLA0

Before NIC ‘fails’
Device Characteristics LLA0:
  Value    Characteristic
  -----    --------------
  256      Max receive buffers
  Yes      Full duplex enable
  .
  .
  1000     Line speed (mbps)
  "EWB0"   Failover device
  "EWA0"   Failover device (active)
  .
  .
  0        Failover priority

After NIC ‘fails’
Device Characteristics LLA0:
  Value    Characteristic
  -----    --------------
  256      Max receive buffers
  Yes      Full duplex enable
  .
  .
  1000     Line speed (mbps)
  "EWB0"   Failover device (active)
  "EWA0"   Failover device
  .
  .
  0        Failover priority
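When this output is saved during a test run, the active member of the failover set can be picked out mechanically. The following is a minimal sketch that assumes the saved text has the "value, then characteristic" layout shown on this slide; the function names are hypothetical and the regular expression may need adjusting for other LANCP versions.

```python
import re

# Matches lines such as:  "EWA0"   Failover device (active)
FAILOVER_LINE = re.compile(
    r'^\s*"(?P<dev>\w+)"\s+Failover device(?P<active>\s+\(active\))?\s*$'
)


def failover_members(lancp_output):
    """Map each failover-set member to True if it is marked active."""
    members = {}
    for line in lancp_output.splitlines():
        match = FAILOVER_LINE.match(line)
        if match:
            members[match.group("dev")] = match.group("active") is not None
    return members


def active_device(lancp_output):
    """Return the single active member, or None if that is ambiguous."""
    active = [dev for dev, is_active in failover_members(lancp_output).items()
              if is_active]
    return active[0] if len(active) == 1 else None


BEFORE = '''
  "EWB0"   Failover device
  "EWA0"   Failover device (active)
  0        Failover priority
'''
AFTER = '''
  "EWB0"   Failover device (active)
  "EWA0"   Failover device
  0        Failover priority
'''

if __name__ == "__main__":
    print(active_device(BEFORE), "->", active_device(AFTER))
```

Comparing the result before and after the cable pull confirms that LLA0 moved from EWA0 to EWB0.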
RAC DT/HA-T4 LAN Failover EWA/B
[Chart: LAN Failover - Pull Cable. [NET.EWA0:]Bytes Recv/Sec and [NET.EWB0:]Bytes Recv/Sec, node QBB0, 7-Apr-2004 17:05 to 17:30; y-axis 0 to 1,900,000. Annotations mark "EWA0 cable pulled" and "EWB0 cable pulled".]
RAC DT/HA-T4 LAN Failover LLA0
[Chart: LAN Failover - Pull Cable. [NET.LLA0:]Bytes Recv/Sec, node QBB0, 7-Apr-2004 17:05 to 17:30; y-axis 100,000 to 1,900,000.]
RAC DT/HA-T4 Overlay of EWA/LLA0
[Chart: LAN Failover - Pull Cable. Overlay of [NET.EWA0:], [NET.EWB0:], and [NET.LLA0:]Bytes Recv/Sec, node QBB0, 7-Apr-2004 17:05 to 17:30; y-axis 0 to 1,900,000.]
RAC DT/HA – failSAFE IP Network
[Diagram: GS160 nodes QBB0 (EIA0 161.114.69.7, EWA0, EWB0, 10.4.4.1) and QBB3 (EIA0 161.114.69.8, EWD0 10.4.4.2, EWE0 10.4.4.3), with the Oracle RAC connection using the EWD0/E0 devices with failSAFE IP. 10.4.4.2 and 10.4.4.3 are configured for failSAFE IP. The Gbit LAN uses the Digital Networks DNswitch 800. EIA0 connects over UTP Ethernet to the 100 Mbit LAN (Extreme Summit 4 switch), shared with PC Swingbench clients 1 to 100. The EVA, holding the common system disk, shared Oracle 9i, and database, is attached via the Fibre SAN A switch.]
RAC DT/HA – failSAFE IP DCL
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1   UGS   0     7999     IE0
10.4.4/24    10.4.4.2       U     274   408185   WE3
10.4.4/24    10.4.4.3       U     274   445714   WE4
10.4.4.2     10.4.4.2       UHL   0     0        WE3
10.4.4.3     10.4.4.3       UHL   0     14       WE4
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
failSAFE IP Addresses:
inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
*inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
failSAFE IP Addresses:
inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
*inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
RAC DT/HA – failSAFE IP DCL Failed 1
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1   UGS   0     7999     IE0
10.4.4/24    10.4.4.2       U     274   408185   WE3
10.4.4/24    10.4.4.3       U     274   445714   WE4
10.4.4.2     10.4.4.2       UHL   0     0        WE3
10.4.4.3     10.4.4.3       UHL   0     14       WE4
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
*failSAFE IP - interface is in a failed state
failSAFE IP Addresses:
inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
*inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
*inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
RAC DT/HA – failSAFE IP DCL Failed 2
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1   UGS   0     7999     IE0
10.4.4/24    10.4.4.2       U     274   408185   WE3
10.4.4/24    10.4.4.3       U     274   445714   WE4
10.4.4.2     10.4.4.2       UHL   0     0        WE3
10.4.4.3     10.4.4.3       UHL   0     14       WE4
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
*inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
*failSAFE IP - interface is in a failed state.
failSAFE IP Addresses:
inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
*inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE3)
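The failed-state marker in this output lends itself to a simple automated check. The sketch below scans saved command output for interfaces flagged as failed; it assumes the text layout shown on these slides, with an interface header line followed by its detail lines, and the function name is hypothetical. The sample text is copied from the first failure slide.

```python
import re

# An interface block starts with a header such as "WE3: flags=c43<...>".
IFACE_HEADER = re.compile(r"^(?P<name>\w+): flags=")
FAILED_MARKER = "interface is in a failed state"


def failed_interfaces(output):
    """Return the interfaces whose block carries the failed-state marker."""
    failed = []
    current = None
    for line in output.splitlines():
        header = IFACE_HEADER.match(line.strip())
        if header:
            current = header.group("name")
        elif current and FAILED_MARKER in line and current not in failed:
            failed.append(current)
    return failed


SAMPLE = """\
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
*failSAFE IP - interface is in a failed state
failSAFE IP Addresses:
inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
*inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
*inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
"""

if __name__ == "__main__":
    print(failed_interfaces(SAMPLE))
```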
RAC DT/HA – T4 data failSAFE IP
[Chart: failSAFE IP - Pull Cable. [NET.EWD0:]Bytes Recv/Sec and [NET.EWE0:]Bytes Recv/Sec, node QBB3, 14-Apr-2004 14:40 to 15:05; y-axis 0 to 1,600,000. Annotations mark "EWE0 cable pulled" and "EWD0 cable pulled".]
RAC DT/HA – 100km cable Network
[Diagram: GS160 nodes QBB0 (EIA0 161.114.69.7, EWA0) and QBB3 (EIA0 161.114.69.8, EWA0), with the Oracle RAC connection using the EWA0 device, separated by 100 km. Each EWA0 connects to a Gbit SCS Extreme Summit 7i switch, and the two switches are joined by the 100 km fiber cable. EIA0 connects over UTP Ethernet to the 100 Mbit LAN (Extreme Summit 4 switch), shared with PC Swingbench clients 1 to 100. The EVA, holding the common system disk, shared Oracle 9i, and database, is attached via the Fibre SAN A switch.]
RAC DT/HA – T4 EWA0 w/100km cable
[Chart: EWA0 with 100 km fiber cable between instances. [NET.EWA0:]Bytes Recv/Sec, node QBB0, 16-Apr-2004 12:25 to 12:55; y-axis 100,000 to 1,700,000.]
RAC DT/HA – T4 EIA compared w/ EWA
Bytes/sec of EIA NIC over UTP and EWA NIC over 100km
Red graph says [NET.EWA0], but this is really [NET.EIA0]
[Chart: two series, both labelled [NET.EWA0:]Bytes Recv/Sec (# 1 and # 2); time axis 26-Mar-2004 16:20 to 16:40; y-axis 0 to 1,700,000.]
RAC DT/HA – Load Generation Data
50k Transactions, no RAC or Network Failure

Network Interface                 Total duration (mm:ss)   TPS
Baseline (EIA 161.114.69.x)       30:08                    27.8
LAN Failover (EWA 10.3.3.x)       30:02                    27.9
failSAFE IP (EWD 10.4.4.x)        30:02                    27.9
100 km Baseline (EWA 10.3.3.x)    29:52                    28.0
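As a cross-check on these figures, a simple run-average throughput can be derived from the transaction count and total duration. This is a naive sketch: the helper names are hypothetical, and the result need not match the TPS column exactly, since that column was reported by the load tool, which may measure over a different interval.

```python
def duration_seconds(mmss):
    """Convert a 'mm:ss' duration string into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def average_tps(transactions, mmss):
    """Transactions divided by wall-clock run time."""
    return transactions / duration_seconds(mmss)


if __name__ == "__main__":
    # Durations copied from the table above; 50,000 transactions per run.
    for label, total in [("Baseline", "30:08"), ("LAN Failover", "30:02"),
                         ("failSAFE IP", "30:02"), ("100 km Baseline", "29:52")]:
        print(f"{label:16s} {average_tps(50_000, total):.2f} TPS")
```

The run averages land within a few tenths of the reported TPS values and show the same ordering across the four configurations.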
RAC DT/HA – Load Generation Data
50k Transactions, Network failover

Network Interface                 Total duration (mm:ss)   TPS
Baseline (EIA 161.114.69.x)       N/A                      N/A
LAN Failover (EWA 10.3.3.x)       30:02                    27.9
failSAFE IP (EWD 10.4.4.x)        30:13                    27.7
100 km Baseline (EWA 10.3.3.x)    N/A                      N/A
RAC DT/HA – Load Generation Data
50k Transactions, 1 RAC instance failed

Network Interface                 Total duration (mm:ss)   TPS    50-client failover (mm:ss)
Baseline (EIA 161.114.69.x)       33:25                    25.0   00:37
LAN Failover (EWA 10.3.3.x)       29:54                    28.0   00:39
failSAFE IP (EWD 10.4.4.x)        30:02                    27.7   00:39
100 km Baseline (EWA 10.3.3.x)    29:39                    28.0   00:43
RAC DT/HA – Conclusions
• RAC seemed to have no problems when running with the network configured to use LAN Failover or failSAFE IP (on the same node).
• There seems to be a definite distributing effect on network traffic when the Oracle init.ora parameter CLUSTER_INTERCONNECTS is used.
RAC DT/HA – Phase II and III
• Phase II: Configure an Oracle 9i RAC 2-node cluster using RAID-1 shadow sets for database and log files, and test the recently released Host-Based Mini-Merge (HBMM) functionality in a variety of configurations.
  Refer to: http://h71000.www7.hp.com/news/hbmm.htm
• Phase III: Distribute nodes in the cluster over 100 km+ distance and test failover and HBMM functionality
RAC DT/HA - References
OpenVMS Technical Journal:
• Matt Muggeridge’s July 2003 - V2 Article: Configuring TCP/IP for High Availability
  http://h71000.www7.hp.com/openvms/journal/v2/articles/tcpip.pdf
• Steve Lieman’s January 2004 - V3 Article: TimeLine-Driven Collaboration with T4 & Friends: A Time-saving Approach to OpenVMS Performance
  http://h71000.www7.hp.com/openvms/journal/v3/t4.pdf
RAC DT/HA – References (cont’d)
• TCPIP docs: http://h71000.www7.hp.com/doc/tcpip54.html
• OpenVMS docs: http://h71000.www7.hp.com/doc/os732_index.html
• HP TCP/IP Services for OpenVMS Management: Chapter 5, Configuring and Managing failSAFE IP
  http://h71000.www7.hp.com/doc/732final/documentation/pdf/aa-lu50m-te.pdf
RAC DT/HA – References (cont’d)
• HP OpenVMS System Management Utilities Reference Manual: Chapter 13, LAN Control Program (LANCP) Utility
  http://h71000.www7.hp.com/doc/732FINAL/DOCUMENTATION/PDF/aa-pv5ph-tk.PDF
• HP OpenVMS System Manager’s Manual, Volume 2 - Tuning, Monitoring, and Complex Systems: Chapter 10, Managing the Local Area Network (LAN) Software
  http://h71000.www7.hp.com/doc/732FINAL/aapv5nh-tk/aa-pv5nh-tk.pdf
RAC DT/HA – References (cont’d)
Oracle References:
• Swingbench: an ‘unofficial’ load generating benchmarking tool, developed in Java, which allows a load to be generated and the transactions/response times to be charted
  http://www.dominicgiles.com/swingbench.php
• OTN (otn.oracle.com): Real 24/7: Use Oracle9i RAC and TAF to guarantee availability.
  http://otn.oracle.com/oramag/oracle/02may/o32clusters.html
RAC DT/HA – References (cont’d)
Oracle Metalink articles: metalink.oracle.com
• Note: 183340.1 - Frequently Asked Questions About the CLUSTER_INTERCONNECTS Parameter in 9i.
• Note: 220970.1 - “Which network is Oracle using for RAC traffic?”
• Note: 162725.1 - OPS/RAC VMS: Using alternate TCP Interconnects on 8i OPS and 9i RAC on OpenVMS.
• Note: 226880.1 - Configuration of Load Balancing and Transparent Application Failover.
OpenVMS Solutions Lab
• Available to customers to test new hardware, software, applications
• Alpha and Integrity systems available for use
• To get the most benefit from the Lab, the customer is expected to be prepared with an exact list of hardware and software requirements, test plan, and goals