OpenVMS Solutions Center Lab Project - Spring 2004: Oracle 9i RAC DT/HA in a distributed OpenVMS Environment
Phase I – Failover

RAC DT/HA – Goals – Phase I
• First: Demonstrate that Oracle 9iRAC continues to run during simulated network failure using LAN Failover and failSAFE IP configurations.
• Second: Measure the latency effect of failover when RAC instances are connected over long distance (100km).

RAC DT/HA – What is Failover?
• Oracle RAC failover: The ability to resume work on an alternate instance upon instance failure
• Oracle TAF (Transparent Application Failover): Runtime failover which enables client applications to automatically reconnect to the database if the connection fails
• LAN Failover: Hardware failover from a failed network interface card (NIC) to another NIC configured as part of the LAN failover set
• failSAFE IP: Address failover to alternate interfaces

RAC DT/HA – Hardware Config
• Two 4-CPU GS160 systems, with Shared Cluster System disk and a Shared Oracle install disk on Enterprise Storage Array connected via Fibre SAN A Switch
• DE602-AA (EIA) NICs, using Twisted Pair on 100 Mbit LAN
   o Extreme Summit4 Switch
• 5-DEGPA-SA, 1-DEGXA-SA (EWA-D) NICs, 1Gbit fiber on 1Gbit LAN
   o Digital Networks DNSwitch 800
• 100km cable - Gbit SCS
   o Extreme Summit 7i Switch

RAC DT/HA – Server Config
• OpenVMS 7.3-2, TCPIP 5.4
• Oracle Server 9.2.0.4, with Oracle patch for bug fix 3026720: Excessive CPU and BUFIO for LMD0 and SMON processes when >2 CPUs
• Running 2 RAC instances, in 2 node cluster
• Requires the INIT<SID>.ORA parameter CLUSTER_INTERCONNECTS to specify alternate network interface for RAC communication

RAC DT/HA – Client Config
• 9.2 SQLNet Client, on PC running Windows 2000
• Benchmark/Load generating software: Swingbench 2.1f - An 'unofficial', Java based, client load generating tool from Oracle, which allows a 'load' to be generated and the transactions/response times to be charted
• Configured to connect 100 clients, load balanced between the 2 instances, and run 50,000 'typical' Order Entry transactions

RAC DT/HA – Test Plan
• Restore from disk backup before each test run to ensure same starting point
• Ensure RAC instances are communicating over the specified network interface
• Run 3 iterations of the same benchmark load while collecting data
   o Run Benchmark load, no failures
   o Run Benchmark load, fail instance
   o Run Benchmark load, fail network connection between instances

RAC DT/HA – Data collection
• T4 running on both nodes, 10sec sampling interval
• Saved Swingbench data results after each run
• Executed and 'saved' output of VMS commands during network failures to see status of network devices and Oracle processes
   $ MC LANCP SHOW DEVICE/CHARACTERISTICS LLA0
   $ TCPIP SHOW INTERFACES/FULL
   $ PIPE SHO SYS|SEA TT: ORA_CPU

Tabular Timeline Tracking Tool – T4
• Created by OpenVMS Sustaining Engineers to help diagnose OS functionality.
• Uses OpenVMS Monitor data, stored in Comma Separated Value file format (.csv file), which can then be used by a variety of applications (spreadsheets, TLViz, etc)
• Download from web. Shipped with OpenVMS 7.3-2, in SYS$ETC directory
   http://h71000.www7.hp.com/openvms/products/t4/index.html
• Users are able to queue data collection and configure data collection frequency
• Helpful in establishing a baseline performance footprint which can then be used in before and after comparisons of system changes
• T4 'hooks' for Oracle and Rdb Server being created

RAC DT/HA – EIA Network
Oracle RAC network connection using EIA device
[Diagram: GS160 nodes QBB0 (EIA0 161.114.69.7) and QBB3 (EIA0 161.114.69.8) on the 100 Mbit LAN (UTP Ethernet, Extreme Summit 4 Switch) with PC Swingbench Clients 1 to 100; EVA storage holding the Common System Disk and the Shared Oracle 9i Database behind the Fiber SAN A Switch.]

RAC DT/HA – T4 data - EIA
[Chart: "EIA0 - Baseline". Series: [NET.EIA0:]Bytes Recv/Sec, node QBB0. Time axis 16:20 to 16:45, 26-Mar-2004. Value axis 100,000 to 1,400,000.]

RAC DT/HA - LAN Failover Network
Oracle RAC connection using LLA0 device for LAN Failover
[Diagram: QBB0 labeled EIA0 161.114.69.7, EWA0, EWB0, LLA0 10.3.3.1; QBB3 labeled EIA0 161.114.69.8, EWA0, EWB0, LLA0 10.3.3.2. Gbit LAN on the Digital Networks DNswitch 800; 100 Mbit LAN (UTP Ethernet) on the Extreme Summit 4 Switch with PC Swingbench Clients 1 to 100; EVA storage holding the Common System Disk and the Shared Oracle 9i Database behind the Fiber SAN A Switch.]

RAC DT/HA – LAN Failover DCL
$ MCR LANCP SHOW DEVICE/CHAR LLA0

Before NIC 'fails'
Device Characteristics LLA0:
    Value   Characteristic
    -----   --------------
      256   Max receive buffers
      Yes   Full duplex enable
      . .   . .
     1000   Line speed (mbps)
   "EWB0"   Failover device
   "EWA0"   Failover device (active)
      . .   . .
        0   Failover priority

After NIC 'fails'
Device Characteristics LLA0:
    Value   Characteristic
    -----   --------------
      256   Max receive buffers
      Yes   Full duplex enable
      . .   . .
     1000   Line speed (mbps)
   "EWB0"   Failover device (active)
   "EWA0"   Failover device
      . .   . .
        0   Failover priority

RAC DT/HA-T4 LAN Failover
[Chart: "EWA/B LAN Failover - Pull Cable". Series: [NET.EWA0:]Bytes Recv/Sec and [NET.EWB0:]Bytes Recv/Sec, node QBB0. Time axis 17:05 to 17:30, 7-Apr-2004. Value axis 0 to 1,900,000. Annotations: "EWA0 cable pulled", "EWB0 cable pulled".]

RAC DT/HA-T4 LAN Failover
[Chart: "LLA0 LAN Failover - Pull Cable". Series: [NET.LLA0:]Bytes Recv/Sec, node QBB0. Time axis 17:05 to 17:30, 7-Apr-2004. Value axis 100,000 to 1,900,000.]

RAC DT/HA-T4 Overlay of EWA/LLA0
[Chart: "LAN Failover - Pull Cable". Series: [NET.EWA0:], [NET.EWB0:] and [NET.LLA0:] Bytes Recv/Sec, node QBB0. Time axis 17:05 to 17:30, 7-Apr-2004. Value axis 0 to 1,900,000.]

RAC DT/HA – failSAFE IP Network
Oracle RAC connection using EWD0/E0 devices FailSafeIP
[Diagram: QBB0 labeled EIA0 161.114.69.7, EWA0, EWB0 and 10.4.4.1; QBB3 labeled EIA0 161.114.69.8, EWD0 10.4.4.2 and EWE0 10.4.4.3. Gbit LAN on the Digital Networks DNswitch 800; 100 Mbit LAN (UTP Ethernet) on the Extreme Summit 4 Switch with PC Swingbench Clients 1 to 100; EVA storage holding the Common System Disk and the Shared Oracle 9i Database behind the Fiber SAN A Switch.]
10.4.4.2 & 10.4.4.3 are configured for FailSafeIP

RAC DT/HA – failSAFE IP DCL
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1   UGS     0     7999   IE0
10.4.4/24    10.4.4.2       U     274   408185   WE3
10.4.4/24    10.4.4.3       U     274   445714   WE4
10.4.4.2     10.4.4.2       UHL     0        0   WE3
10.4.4.3     10.4.4.3       UHL     0       14   WE4

WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
     failSAFE IP Addresses:
      inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
     *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
     failSAFE IP Addresses:
      inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
     *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500

RAC DT/HA – failSAFE IP DCL Failed 1
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1   UGS     0     7999   IE0
10.4.4/24    10.4.4.2       U     274   408185   WE3
10.4.4/24    10.4.4.3       U     274   445714   WE4
10.4.4.2     10.4.4.2       UHL     0        0   WE3
10.4.4.3     10.4.4.3       UHL     0       14   WE4

WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
     *failSAFE IP - interface is in a failed state
     failSAFE IP Addresses:
      inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
     *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
     *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
      inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500

RAC DT/HA – failSAFE IP DCL Failed 2
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
default      161.114.69.1   UGS     0     7999   IE0
10.4.4/24    10.4.4.2       U     274   408185   WE3
10.4.4/24    10.4.4.3       U     274   445714   WE4
10.4.4.2     10.4.4.2       UHL     0        0   WE3
10.4.4.3     10.4.4.3       UHL     0       14   WE4

WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
     *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
      inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
     *failSAFE IP - interface is in a failed state.
     failSAFE IP Addresses:
      inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
     *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE3)

RAC DT/HA – T4 data failSAFE IP
[Chart: "FailSafeIP - Pull Cable". Series: [NET.EWD0:]Bytes Recv/Sec and [NET.EWE0:]Bytes Recv/Sec, node QBB3. Time axis 14:40 to 15:05, 14-Apr-2004. Value axis 0 to 1,600,000. Annotations: "EWE0 cable pulled", "EWD0 cable pulled".]

RAC DT/HA – 100km cable Network
Oracle RAC connection using EWA0 device separated by 100km
[Diagram: QBB0 labeled EIA0 161.114.69.7 and EWA0; QBB3 labeled EIA0 161.114.69.8 and EWA0. The two EWA0 devices connect through a Gbit SCS Extreme Summit 7i switch at each end of the 100km Fiber Cable; 100 Mbit LAN (UTP Ethernet) on the Extreme Summit 4 Switch with PC Swingbench Clients 1 to 100; EVA storage holding the Common System Disk and the Shared Oracle 9i Database behind the Fiber SAN A Switch.]

RAC DT/HA – T4 EWA0 w/100km cable
[Chart: "EWA0 with 100km Fiber cable between instances". Series: [NET.EWA0:]Bytes Recv/Sec, node QBB0. Time axis 12:25 to 12:55, 16-Apr-2004. Value axis 100,000 to 1,700,000.]

RAC DT/HA – T4 EIA compared w/ EWA
Bytes/sec of EIA NIC over UTP and EWA NIC over 100km
Red graph says [NET.EWA0], but this is really [NET.EIA0]
[Chart: two series, both labeled [NET.EWA0:]Bytes Recv/Sec (#1 and #2). Time axis 16:20 to 16:40, 26-Mar-2004. Value axis 0 to 1,700,000.]

RAC DT/HA – Load Generation Data
50k Transactions, no RAC or Network Failure

Network Interface                 Total duration   TPS
Baseline (EIA 161.114.69.x)       30:08            27.8
Lan Failover (EWA 10.3.3.x)       30:02            27.9
FailSafe IP (EWD 10.4.4.x)        30:02            27.9
100 km Baseline (EWA 10.3.3.x)    29:52            28.0

RAC DT/HA – Load Generation Data
50k Transactions, Network failover

Network Interface                 Total duration   TPS
Baseline (EIA 161.114.69.x)       N/A              N/A
Lan Failover (EWA 10.3.3.x)       30:02            27.9
FailSafe IP (EWD 10.4.4.x)        30:13            27.7
100 km Baseline (EWA 10.3.3.x)    N/A              N/A

RAC DT/HA – Load Generation Data
50k Transactions, 1 RAC instance failed

Network Interface                 Total duration   TPS    50-client Failover
Baseline (EIA 161.114.69.x)       33:25            25.0   00:37
Lan Failover (EWA 10.3.3.x)       29:54            28.0   00:39
FailSafe IP (EWD 10.4.4.x)        30:02            27.7   00:39
100 km Baseline (EWA 10.3.3.x)    29:39            28.0   00:43

RAC DT/HA – Conclusions
• RAC seemed to have no problems when running with network configured to use LAN Failover or failSAFE IP (on the same node).
• There seems to be a definite distributing effect on network traffic when the Oracle init.ora parameter CLUSTER_INTERCONNECTS is used

RAC DT/HA – Phase II and III
• Phase II: Configure Oracle 9iRAC 2-node cluster using Raid-1 Shadow Sets for database and logfiles, and test recently released Host Based Mini-Merge (HBMM) functionality in a variety of configurations. Refer to: http://h71000.www7.hp.com/news/hbmm.htm
• Phase III: Distribute nodes in cluster over 100km+ distance and test failover and HBMM functionality

RAC DT/HA - References
OpenVMS Technical Journal:
• Matt Muggeridge's July 2003 - V2 Article: Configuring TCP/IP for High Availability
   http://h71000.www7.hp.com/openvms/journal/v2/articles/tcpip.pdf
• Steve Lieman's January 2004 - V3 Article: TimeLine-Driven Collaboration with T4 & Friends: A Time-saving Approach to OpenVMS Performance
   http://h71000.www7.hp.com/openvms/journal/v3/t4.pdf

RAC DT/HA – References (cont'd)
• TCPIP docs: http://h71000.www7.hp.com/doc/tcpip54.html
• OpenVMS docs: http://h71000.www7.hp.com/doc/os732_index.html
• HP TCP/IP Services for OpenVMS Management: Chapter 5, Configuring and Managing FailSAFE IP
   o http://h71000.www7.hp.com/doc/732final/documentation/pdf/aa-lu50m-te.pdf

RAC DT/HA – References (cont'd)
• HP OpenVMS System Management Utilities Reference Manual: Chapter 13, LAN Control Program (LANCP) Utility
   o http://h71000.www7.hp.com/doc/732FINAL/DOCUMENTATION/PDF/aa-pv5ph-tk.PDF
• HP OpenVMS System Manager's Manual, Volume 2 - Tuning, Monitoring, and Complex Systems: Chapter 10, Managing the Local Area Network (LAN) Software
   o http://h71000.www7.hp.com/doc/732FINAL/aapv5nh-tk/aa-pv5nh-tk.pdf

RAC DT/HA – References (cont'd)
Oracle References:
• Swingbench – an 'unofficial' load generating benchmarking tool, developed in Java, which allows a load to be generated and the transactions/response times to be charted
   http://www.dominicgiles.com/swingbench.php
• OTN: otn.oracle.com
   o Real 24/7: Use Oracle9i RAC and TAF to guarantee availability.
     http://otn.oracle.com/oramag/oracle/02may/o32clusters.html

RAC DT/HA – References (cont'd)
Oracle Metalink articles: metalink.oracle.com
• Note: 183340.1 - Frequently Asked Questions About the CLUSTER_INTERCONNECTS Parameter in 9i.
• Note: 220970.1 - "Which network is Oracle using for RAC traffic?"
• Note: 162725.1 - OPS/RAC VMS: Using alternate TCP Interconnects on 8i OPS and 9i RAC on OpenVMS.
• Note: 226880.1 – Configuration of Load Balancing and Transparent Application Failover.

OpenVMS Solutions Lab
• Available to customers to test new hardware, software, applications
• Alpha and Integrity systems available for use
• To get the most benefit from the Lab, customer is expected to be prepared with exact list of hardware and software requirements, test plan and goals
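
Cross-checking the Load Generation Data
The TPS column in the Load Generation Data tables can be roughly cross-checked against the total duration of each 50,000-transaction run. The sketch below assumes TPS is simply transactions divided by elapsed seconds; the function names are hypothetical and are not part of Swingbench or T4. Under that assumption the no-failure runs come out slightly lower than the reported figures (for example about 27.65 against the reported 27.8 for the EIA baseline), so Swingbench apparently derives its number differently; the slides do not say how.

```python
def duration_to_seconds(mmss: str) -> int:
    """Convert a 'MM:SS' duration string, as used in the tables, to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def average_tps(transactions: int, mmss: str) -> float:
    """Average transactions per second over the whole run (simple ratio).

    Assumption: TPS = transactions / elapsed seconds. The tool's own
    TPS figure may be computed over a different window.
    """
    return transactions / duration_to_seconds(mmss)


# Total durations copied from the "no RAC or Network Failure" table.
NO_FAILURE_RUNS = {
    "Baseline (EIA)": "30:08",
    "Lan Failover (EWA)": "30:02",
    "FailSafe IP (EWD)": "30:02",
    "100 km Baseline (EWA)": "29:52",
}

if __name__ == "__main__":
    for name, duration in NO_FAILURE_RUNS.items():
        tps = average_tps(50_000, duration)
        print(f"{name:24s} {duration}  {tps:.2f} TPS")
```

The same helper can be applied to the "1 RAC instance failed" table, where the 33:25 baseline run works out to just under 25 TPS, in line with the reported 25.0.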