Failover and Recovery in WebSphere
Application Server Advanced Edition 4.0
By:
Tom Alcott
Michael Cheng
Sharad Cocasse
David Draeger
Melissa Modjeski
Hao Wang
Revision Date:
December 18, 2001
Table of Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
About the Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Chapter 1 - WebSphere Topologies for Failover and Recovery. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1 Introduction to High Availability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.1 The Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.2 The 5 Nines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.1.3 Considering the Context. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.1.4 Types of Availability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.5 Fundamental Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.5.1 Hardware Clustering. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.5.2 Software Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1.5.3 Advantages and Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.5.4 WebSphere Application Server clustering. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.6 Hardware Availability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.7 Planned Maintenance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.8 Disaster Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.1.9 Evaluating your HA Solutions.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.2 Failover with WebSphere Application Server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.2.1 WebSphere Servlet Requests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.2.2 EJB Requests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.2.3 Administrative Server High Availability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.2.4 Discussions and Problem Spots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3 Failover Topologies with WebSphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3.1 HTTP Requests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.3.2 Database Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.3.3 LDAP Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.3.4 Firewall . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.4 Suggested Topologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Chapter 2 - HTTP Server Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Chapter 3 - Web Container Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.1 WebSphere Server Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2 The HTTP Server Plug-in . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Web Container Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.4 Session affinity and session persistence. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Chapter 4 - EJB Container Failover. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.1 The WLM plug-in . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2 EJB Container failover. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 WLM and EJB Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
4.4 WLM and EJB Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Chapter 5 - Administrative Server Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.2 Runtime Support in the Administrative Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2.1 Types of clients that use the admin server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2.1.1 Application server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.2.1.2 EJB clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.2.2 Enabling High Availability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.2.2.1 JNDI Client Failover Behavior. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.2.2.2 Location Service Daemon (LSD) Behavior. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.2.2.3 Security Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2.2.4 Transaction logging. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2.2.5 Starting an application server. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3 System Administration Support in the Administrative Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3.1 Types of System Administration Clients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3.1.1 Administrative Console (admin console). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3.1.2 XMLConfig . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3.1.3 WSCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.3.2 Known limitations of System Administration Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.4 Configuration parameters which affect admin server failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Chapter 6 - WebSphere Database Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.2 Application Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.2.1 StaleConnectionException. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.2.1.1 Connections in auto-commit mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.3 Session Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
6.3.1 Expected Behavior - Servlet Service Method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.3.2 Expected Behavior - Manual Update. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.3.3 Expected Behavior - Time Based Update. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
6.4 Administrative Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.4.1 System Administration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.4.2 Application Runtime Environment - JNDI caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.4.2.1 XML/DTD syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.4.2.2 Configuring the JNDI cache for an application server . . . . . . . . . . . . . . . . . . . . . . . 60
6.4.2.3 Configuring the JNDI Cache for an Application Client . . . . . . . . . . . . . . . . . . . . . . 60
6.4.2.4 Preloading JNDI Cache for Thin Client. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
6.4.2.5 Operating Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
6.4.2.6 JNDI Cache Size Considerations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.4.3 HA Administrative Database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.4.4 Administrative Database Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Chapter 7 - Integrated Hardware Clustering for Database High Availability . . . . . . . . . . . . . . . . . . . 64
7.1 HACMP and HACMP/ES on IBM AIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.1.1 Introduction and Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
7.1.2 Expected Reactions to Failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.1.3 WebSphere HACMP configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
7.1.4 Tuning heartbeat and cluster parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7.2 MC/Serviceguard on the HP-UX Operating System. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7.2.1 Introduction and Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7.2.2 Expected Reactions to Failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.2.3 WebSphere MC/ServiceGuard configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
7.2.4 Tuning heartbeat and cluster parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.3 Microsoft Clustered SQL Server on Windows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.3.1 Introduction and Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
7.3.2 Expected Reactions to Failures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
7.3.4 Tuning heartbeat and cluster parameters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Chapter 8 - Failover for Other Components. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.1 Firewalls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
8.2 LDAP Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Appendix A Installing Network Dispatcher (ND) on Windows NT. . . . . . . . . . . . . . . . . . . . . . . . . . 84
Appendix B - Configuring TCP Timeout parameters by OS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
Appendix C - MC/ServiceGuard setup instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Appendix D HACMP setup instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Appendix E - Microsoft Clustering Setup Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
Preface
This document discusses various options for making an IBM WebSphere Application Server V4.x
environment Highly Available (HA). It is meant for WebSphere architects and advanced WebSphere users
who wish to make their WebSphere environments highly available. We will discuss recommended failover
topologies as well as their implementation details. In this document, we suggest a 3-pronged approach to
WebSphere failover:
1. Using WebSphere’s built-in failover capability
Failover is achieved using the Work Load Management (WLM) facility at the Web Module, EJB Module,
and Administrative Server levels. These concepts are discussed in detail in Chapters 3, 4 and 5.
2. Application code best practices for failure-related exception handling and recovery
In the case of database failover, application code should be able to handle JDBC exceptions and roll over
to the “healthy” database server. This is achieved by ensuring that the application code is “failover-ready”
(a minimal sketch follows this list). We discuss this and other database failover concepts in chapter 6.
3. Using other IBM and third party products in conjunction with WebSphere
There are a wide variety of IBM and third party products that can be used to enhance the failover
capability in WebSphere and the list continues to grow. These products include clustering software for the
various operating systems, HA databases and HA web server clusters. These are discussed in chapters 2,
7, and 8.
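As a brief illustration of the second point above, here is a minimal sketch of “failover-ready” JDBC code that
retries a unit of work when WebSphere reports a stale connection. The data source JNDI name, SQL statement,
and retry count are illustrative assumptions; chapter 6 covers the real considerations (including auto-commit
behavior and transaction scope) in detail.

// Minimal sketch of "failover-ready" JDBC code: retry the unit of work when the
// pooled connection turns out to be stale after a database failover.
// The data source JNDI name, SQL, and retry limit are illustrative assumptions.
import java.sql.Connection;
import java.sql.PreparedStatement;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import com.ibm.websphere.ce.cm.StaleConnectionException;

public class FailoverReadyUpdate {
    public void updateBalance(String account, double amount) throws Exception {
        DataSource ds = (DataSource) new InitialContext().lookup("jdbc/SampleDS");
        int attempts = 0;
        boolean done = false;
        while (!done && attempts < 3) {          // bounded retry after a database failover
            attempts++;
            Connection conn = null;
            try {
                conn = ds.getConnection();       // may hand back a stale pooled connection
                PreparedStatement ps = conn.prepareStatement(
                        "UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ID = ?");
                ps.setDouble(1, amount);
                ps.setString(2, account);
                ps.executeUpdate();
                done = true;                     // unit of work succeeded
            } catch (StaleConnectionException sce) {
                // The connection belonged to the failed database server; retrying obtains a
                // fresh connection, which (once failover completes) reaches the healthy server.
            } finally {
                if (conn != null) conn.close();
            }
        }
        if (!done) throw new Exception("Database still unavailable after retries");
    }
}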
Failover and recovery in WebSphere is an extremely broad subject. In this document we have made an
attempt to provide hands-on instructions for some of the tasks and have referenced on-line resources such as
IBM RedBooks and the WebSphere InfoCenter for other tasks.
About the Authors
This document was produced by a team of WebSphere specialists based in the United States.
Tom Alcott is an advisory I/T specialist in the United States. He has been a member of the World Wide
WebSphere Technical Sales Support team since its inception. In this role, he spends most of his time trying to
stay one page ahead of customers in the manual. Before he started working with WebSphere, he was a
systems engineer for IBM’s Transarc Lab supporting TXSeries. His background includes over 20 years of
application design and development on both mainframe-based and distributed systems. He has written and
presented extensively on a number of WebSphere runtime and security issues.
Michael Cheng is a member of the WebSphere Architecture Board. He is currently working on the high
availability architecture of the WebSphere Application Server. Previously, he had been a significant
contributor to IBM's ORB technology, and worked on supporting EJBs in IBM Component Broker. He
received a Ph.D. in computer sciences from the University of Wisconsin-Madison.
Sharad Cocasse is a Staff Software Engineer on the WebSphere Execution Team. His experience includes
several years of writing object-oriented applications and developing/delivering WebSphere education to
Customers and IBM field technical personnel. Most recently, he has been directly helping Customers to be
successful with their use of the WebSphere Application Server. Sharad holds a Master’s degree in Electrical
Engineering from the University of Alabama and lives and works in Rochester Minnesota.
David Draeger is a WebSphere Software Engineer and member of the WebSphere Execution Team out of
Rochester, MN. He has worked on-site with multiple customers during critical situations to resolve issues with
WebSphere failover and other topics. Dave has also authored many white papers dealing with WebSphere
Application Server. He has helped in the development of tools for WebSphere and has driven new
innovations into the WebSphere Application Server software. Dave received a BS in Computer Science from
the engineering college of the University of Illinois at Urbana/Champaign.
Melissa Modjeski is a member of the WebSphere Execution team, working closely with customers
designing and implementing WebSphere environments and applications. She has also been involved in
developing WebSphere 3.0, 3.5, and 4.0 education for IBM field members and WebSphere customers.
Melissa received a Bachelor of Science degree in Computer Science and Mathematics from Winona State
University.
Hao Wang is a member of the WebSphere development team. He has been working on the San Francisco
project, WebSphere Solutions Integration, and WebSphere Connection Management. His background
includes a Ph.D. degree in Computer Science from Iowa State University. Before he joined IBM in January
1999, he worked for the university as an associate professor and scientist for more than 10 years, instructed
graduate-level courses such as the principles of database systems, and conducted R&D on high-performance
distributed and parallel computing, cluster computing, and computer simulation models; he also
worked as an IT consultant.
Thanks to the following people for their invaluable contributions to this project:
Tony Arcuri
Keys Botzum
Dave Cai
Utpal Dave
John Koehler
Makarand Kulkarni
Keith McGuinnes
Kevin Zemanek
Chapter 1 - WebSphere Topologies for Failover and Recovery
1.1 Introduction to High Availability
1.1.1 The Basics
Before discussing the considerations for providing a highly available WebSphere Application Server
implementation, it’s important to provide some background on the subject of high availability in order to ensure
a common understanding of various terms that will be used later in this document.
To make a system “Highly Available” is to design a system or system infrastructure so as to eliminate or
minimize the loss of service due to either unplanned or planned outages. While for purposes of this paper we’ll
be referring to Highly Available Computer Systems, this term can be applied to a number of systems such as
the local water utility or phone system.
The meaning of the term “Highly Available” varies based on the service provided and the expected hours of
operation. For example, a system used by the local dry cleaner need only be available during normal business
hours 5 to 6 days a week (typically 10 to 12 hours/day), while a system used to host an airline reservation
system is expected to be available 7x24 (or nearly so). This brings us to our next term, “Continuous
Availability”, which is equated with nonstop service. While Continuous Availability is certainly a laudable goal
for some systems/services, in practice it is much like perfection, in other words “an absolute that is never
achieved”. Moreover, it’s important to realize that “High Availability” does not equal “Continuous
Availability”.
So if “High Availability” does not equal “Continuous Availability”, what does it mean? For purposes of
discussion in this paper we’ll adopt a definition of “providing a computer service that satisfies a defined service
level”. What’s a “service level?” Well, as alluded to above in the examples of the dry cleaner and the airline
reservation system, it’s providing system availability as appropriate for business. This includes provisions for
both planned and unplanned outages, but does not strive to exceed the business needs. By way of example, in
a past employment it was necessary for one of the authors to provide a system that satisfied a service level of
“6x24 and 1x20, 99.95% of the time”, which meant that the system was expected to be available Monday
through Saturday, 24 hours a day and on Sunday until 8:00 p.m., at which point 4 hours were allotted for
system maintenance. The requirement for “99.95%” availability meant that the system was required to be
available for 163.92 hours out of the 164 hours specified, meaning that all unplanned outages could total no
more than 5 minutes a week. This is certainly a reasonable and cost-effective goal for most enterprises. A
higher availability requirement tends to incur increasingly higher costs for the hardware and system
infrastructure.
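To make the arithmetic behind such a service level concrete, the short Java sketch below (purely illustrative)
computes the allowed downtime for the “6x24 and 1x20, 99.95%” window described above, and for a
24x7x365 system at several availability levels. The results match the table in the next section.

// Worked example of availability arithmetic: how much downtime a given
// availability percentage allows over a given service window.
public class AvailabilityBudget {
    static double allowedDowntimeHours(double windowHours, double availabilityPct) {
        return windowHours * (1.0 - availabilityPct / 100.0);
    }

    public static void main(String[] args) {
        // "6x24 and 1x20" service window: 6 days x 24 h + 1 day x 20 h = 164 hours per week.
        double weeklyWindow = 6 * 24 + 20;
        System.out.printf("99.95%% of %.0f h/week allows %.1f minutes of outage%n",
                weeklyWindow, allowedDowntimeHours(weeklyWindow, 99.95) * 60);   // ~4.9 minutes

        // 24x7x365 system: 8,760 hours per year.
        double yearlyWindow = 24 * 365;
        for (double pct : new double[] {99.0, 99.5, 99.95, 99.999}) {
            System.out.printf("%.3f%% availability allows %.2f hours of downtime per year%n",
                    pct, allowedDowntimeHours(yearlyWindow, pct));               // 87.6, 43.8, 4.38, 0.09
        }
    }
}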
1.1.2 The 5 Nines
This leads to the next term “5 nines”, which refers to making a system available 99.999% of the time. This is
generally the “gold standard” for high availability and is offered by most of the leading hardware manufacturers.
Examples of products that claim this type of availability are:
• IBM with High Availability Cluster Multiprocessing (HACMP™) on AIX™
• Sun with Sun™ Cluster on Solaris™
• Hewlett Packard with MultiComputer/ServiceGuard (MC/ServiceGuard) on HP-UX™
• Microsoft with Windows 2000 and Windows NT Clustering
• Veritas Cluster Server™
The availability offered by “5 nines” as compared to other service levels can be seen in the table below. As
can be seen, “5 nines” provides for a system that is available for all but 5 minutes per year.
Uptime & Downtime for a 24x7x365 System (hours per year)

Uptime      Hours Unavailable    Hours Available
99.0%       87.6                 8,672.40
99.5%       43.8                 8,716.20
99.95%      4.38                 8,755.62
99.999%     0.09                 8,759.91
100.0%      0                    8,760
1.1.3 Considering the Context
When looking at measurements of system availability, it is important to consider the context of that
measurement. For example, a “5 nines” operating system/hardware combination does not mean that your
systems will now only have 5 minutes of downtime in a year. Instead, this means that the operating system and
associated hardware can achieve this. An entire system must take into account the network, software failures,
human error, and a myriad of other factors. Thus, speaking of availability without a context does not
necessarily have enough meaning.
In order to achieve a true highly-available system and accurately measure its availability, one must consider the
entire system and all of the components. Then, the business constraints and goals must be considered. Only
then can one meaningfully speak of availability. Issues to consider when designing for high availability are:
• Disk failure
• CPU or machine failure
• Process crash/failure
• Network failure
• Catastrophic failure, e.g., the loss of an entire data center
• Human error
• Planned hardware and software maintenance and upgrades
For example, a fault tolerant disk array will not normally address the loss of an entire data center. For some
businesses, these issues must be considered. As the saying goes, a system is only as strong as its weakest link.
Of course any business must weigh the cost of a system outage versus the cost of any high availability
implementation. The incremental cost of providing a highly available architecture increases rapidly once an
organization reaches availability of 99.9% or more.
1.1.4 Types of Availability
When thinking about availability, two aspects must be considered: process availability and data availability.
Process availability is simply the state where processes exist that can process requests. Data availability occurs
when data is preserved across process failures and is available for processes that continue to be available. In
many systems, data availability is crucial. For example, it is of little value for a banking system to remain
available if it can not access account data.
Data availability can be further broken down by the general types of data:
• static data – binaries, install images, etc.
• rarely changing data – configuration information, new document versions, passwords, etc.
• active data – data that is rapidly changing. This type of data usually represents the essence of the
system. For a banking system this would include account data.
Each of these data types requires different types of actions to maintain its availability. For example, static data
can usually just be installed at initial application load and perhaps copied to replica machines. Rarely changing
data can often be manually updated on each replica. Of course, in either case, manual processes can be
automated to reduce human error, but the key point is that data rarely changes, so maintaining availability is not
difficult. For active data, the problem is much harder. Significant efforts must be undertaken to maintain the
availability of active data. Common techniques include fault tolerant disk arrays and automated replication.
1.1.5 Fundamental Techniques
For purposes of this document we’ll define highly available systems as relying on the following technologies:
• Hardware Clustering
• Software Clustering

NOTE: This initial discussion is intended solely as a brief overview of the technologies used for high
availability. It is neither a recommendation nor a statement of support.
1.1.5.1 Hardware Clustering
In hardware clustering HA systems, the processes/software to be made highly available are configured to run
on one or more server and in the case of a failure are moved from one server to another. Examples of
hardware based HA include:
• HACMP
• Sun Cluster
• MC/Serviceguard
• Veritas Cluster Server
• MS Clustering
The noted products provide a mechanism for clustering of software applications across multiple machines, thus
eliminating any given server as a Single Point of Failure (SPOF).
In general hardware clustering products provide a cluster manager process that periodically polls (also known
as “checking the heartbeat”) the other software processes in the cluster to determine if the software, and
hardware that it’s running on, is still active. If a “heartbeat” is not detected, then the cluster manager moves the
software process to another server in a cluster. While the movement of the process from one machine to
another is not instantaneous, it can be accomplished in fairly short order, typically from 30 seconds to a few
minutes, depending on the hardware and software employed. Hardware clustering HA systems typically run in
either an “active/active” mode of operation or an “active/standby” mode of operation.
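The sketch below illustrates, at a purely conceptual level, the poll-and-failover cycle just described. It is not
how HACMP, MC/ServiceGuard, or any other listed product is implemented; the polling interval, miss
threshold, and the isAlive/moveToStandby operations are assumptions made only for the illustration.

// Conceptual sketch of a cluster manager's heartbeat loop: poll a managed
// process periodically and move it to another server once heartbeats stop.
// The interfaces and thresholds are illustrative assumptions, not any
// product's actual implementation.
public class HeartbeatMonitor implements Runnable {
    interface ManagedResource {
        boolean isAlive();          // "heartbeat" check against the process and its hardware
        void moveToStandby();       // restart the resource on another node in the cluster
    }

    private final ManagedResource resource;
    private final long intervalMillis = 5_000;   // polling interval (assumed)
    private final int missesBeforeFailover = 3;  // tolerate transient misses (assumed)

    public HeartbeatMonitor(ManagedResource resource) {
        this.resource = resource;
    }

    public void run() {
        int missed = 0;
        while (!Thread.currentThread().isInterrupted()) {
            missed = resource.isAlive() ? 0 : missed + 1;
            if (missed >= missesBeforeFailover) {
                resource.moveToStandby();        // failover: typically 30 seconds to a few minutes
                missed = 0;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}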
Depicted below is an “active/active” configuration where all machines are active all the time, each running
some portion of the overall workload during normal operation. Once a failure is detected, the application or
process is moved from the failed server to the remaining server, which then runs the entire workload.
Figure 1.1: Active/Active failover configuration (diagram: before failover, Server 1 runs Application 1 and
Server 2 runs Application 2 against mirrored disks; after failover, Server 2 runs both applications)
In the above scenario, process availability is achieved by starting the appropriate application processes on the
replica machine. Data availability is achieved by using fault tolerant disk arrays. Both machines have access to
the same data.
Another alternative for a hardware clustering HA system is to utilize an “active/standby” configuration where
only one server is actively running the workload. A failure triggers a move of all processes or applications from
the active server to the standby server.
Figure 1.2: Active/Standby failover configuration (diagram: before failover, Application 1 runs on Server 1
while Server 2 stands by; after failover, Application 1 runs on Server 2, with both servers attached to
mirrored disks)
1.1.5.2 Software Clustering
Software clustering based HA systems most typically utilize some mechanism for data replication for the
purpose of providing database high availability. Examples of software based HA include:
• IBM UDB/DB2 Data Replication
• Oracle Parallel Server
With this type of technology the database manager is configured to replicate the database instance, so that all
database updates occur on two instances. In normal operation access to the database occurs via the primary
database instance. When a problem occurs in one of the instances, the remaining instance is used.
Figure 1.3: Database Replication failover configuration (diagram: Database Instance 1 on Server 1 is
replicated to Database Instance 2, a copy of Instance 1, on Server 2; after failover, Instance 2 serves the
workload)
In this scenario, process availability is achieved by running active database processes on both machines
simultaneously. Presumably, the clients accessing the database have the ability to fail over to the backup
database server processes as needed. Data availability is achieved by replicating data (using database
techniques) from one database to the secondary continuously. Thus, the second database has a nearly current
copy of the system data at all times.
1.1.5.3 Advantages and Disadvantages
While hardware clustering based HA can be used to make a database, other applications, and processes
highly available, software clustering HA is usually limited to databases. Both approaches have advantages and
disadvantages. As noted, hardware clustering can be used for applications and processes other than the
database. Another advantage with hardware clustering is the light process overhead incurred by the cluster
manager process and the “heartbeat” mechanism used to determine hardware or software “health”. When
compared to hardware clustering, the ongoing data replication employed by software clustering exacts a higher
resource/performance cost. On the other hand, failover for software clustering will occur more rapidly than
with hardware clustering. The database processes are already active with the former, while with the latter the
process must be moved from one server to another and started. The amount of time required for failover will
vary based on the amount of time it takes to detect a failure, time for process startup scripts to execute, any
database transaction recovery, etc. As noted previously, failover can occur in roughly 30 seconds to a few
minutes, depending on the hardware and software in use.
1.1.5.4 WebSphere Application Server clustering
Conspicuous by its absence in the preceding discussion is the clustering technology employed by WebSphere
Application Server. WebSphere Application Server relies on “none of the above” to provide clustering
capability. Rather, WebSphere Application Server provides inherent clustering capability via several
technologies as discussed in chapters 3, 4, and 5.
1.1.6 Hardware Availability
WebSphere Application Server is part of a much larger environment that includes the network, power supply
and other components required of a computing infrastructure.
Failure to consider all these additional components will negate any effort expended in deploying WebSphere
Application Server in a highly available environment.
While use of a HA product can eliminate a given machine as a single point of failure, it’s also important to
make provisions for redundant hardware on a given machine. Some examples of redundant hardware include:
• Disk Controllers/Drives
• Network Interface Cards
• Power Supplies/Sources
1.1.7 Planned Maintenance
Much of the discussion so far has dealt with use of hardware and software clustering as a means of insuring
against an unexpected outage. Clustering can also be employed to allow for service or upgrades on machines
in a cluster while the remaining machines in the cluster continue to provide service. This is a very important
consideration. Many businesses overlook the very real need to upgrade hardware and software on a regular
basis. Upgrades are less risky if alternate systems can still provide services during the upgrades. Planned
Maintenance in WebSphere V4.x may involve the use of multiple domains and a special programming model.
These techniques are outside the scope of this white paper.
1.1.8 Disaster Recovery
Cluster technology eliminates single points of failure; a related but different topic is disaster recovery. Disaster
recovery refers to how the system recovers from catastrophic site failures. Examples of catastrophic failures
include earthquakes, tornados, hurricanes, floods, and fire. In simple terms, disaster recovery involves
replicating an entire site, not just pieces of hardware or sub-components. The service level requirements for
disaster recovery depend on the application. Some applications may not have any disaster recovery
requirement. Others may simply have backup data tapes from which they would rebuild a new working site
over a period of weeks or months. Still others may have requirements to begin operations 24 or 48 hours
after the disaster. The most stringent requirement is to keep the service running despite the disaster. Though
disaster recovery is a very important topic when considering application availability, it too is beyond the scope
of this paper.
1.1.9 Evaluating your HA Solutions
We have already discussed the concept of percentage of time available, such as "99.9% availability". This is
the number most often quoted to express how much of the time a system is expected to be available.
However, there are other factors that should be taken into consideration when putting together a highly
available system. These factors include recovery time, recovery point, cost, programming model, setup and
administration complexity, and special operating conditions and restrictions.
The recovery time is the duration before the application resumes useful processing again. For the purpose of
this paper, the recovery time includes the time to detect a failure and time to perform failover. As an example,
we’ve already mentioned that a database utilizing software replication may be able to fail over faster than one
that uses hardware clustering. But failover time is not just limited to databases. All services being used by the
client, including the application and all the services that it uses, must be considered. Recovery time is related
to availability in that with a given probability of failure, the shorter the recovery time, the higher the percent
availability. Recovery time is also related to the type of failure. A cascaded failure that brings down many
components may take longer to recover than a single failure. Recovery time for algorithms based on time-outs
can often be tuned.
The recovery point refers to where useful processing resumes. It is directly related to the amount of work that
is lost during failover. During database failover, the amount of data lost may be limited to uncommitted
transactions. However, if asynchronous replication is used, some committed work may also be lost under
some error conditions. The same applies to any other part of the application, whether or not a database is
being used to store the data.
The cost factor is directly related to the first two factors. In general, the less stringent the availability, recovery
time, and recovery point requirements, the less the cost. A system offering 99.999% availability, with recovery
time in subseconds, and recovery point of almost no transaction lost during normal failure, can cost orders of
magnitude more than a system with 99.9% availability with recovery time in minutes and recovery point of
losing all pending transactions, and perhaps more. Taking disaster recovery into account, additional bandwidth
and backup sites may have to be set up, adding even more to the cost.
The programming model refers to what the application code needs to do during failover. Some services are
able to perform failover transparently. Since the failure is masked from the application, there is nothing special
that the application needs to do. Other services will require specific actions from the application. For example,
the application may get an error message during failover, with the requirement that the operation is retried after
the failover is completed. Since application development, testing, and maintenance can involve significant
costs, the simpler the programming model the better.
The setup and administration complexity should be considered when designing a HA system. Some solutions
may require additional third party prerequisites with their own setup and administration overhead. Others may
be easy to configure, with no special setup necessary.
Usually there are multiple solutions to address a given failure, with different tradeoffs. Some solutions may
have special restrictions, such as working only for certain failures, or working better for some failures than
others, or being unable to maintain a consistent view of data under certain failure sequences. The
designer of an HA system needs to evaluate all the restrictions to ensure critical data is not impacted, and
that there are ways to work around failure conditions not covered by a solution.
1.2 Failover with WebSphere Application Server
WebSphere Application Server is architected to provide for clustering. With prudent planning and deployment,
the WebSphere runtime (and your applications) can be made highly available. When planning for high
availability with WebSphere there are a number of WebSphere components as well as supporting software
components that must be taken into account. In brief these components are:
• Network
• HTTP Server
• WebSphere Application Server (both web container and EJB container)
• WebSphere Administration Server (Security, naming (bootstrap and name server), Location Service
  Daemon, WLM, Serious Event Log, transaction log)
• Databases (application, administration, session)
• LDAP Server
• Firewall
These components are depicted below and will be discussed in more detail in subsequent chapters.
Figure 1.4: End-to-End failover configuration (diagram: HTTP clients reach the HTTP server through the
Internet and a firewall; requests flow to the web container and EJB container, supported by the
administrative server, transaction log, WebSphere administrative database, application and session
databases, LDAP server, and standalone Java clients)
Within the WebSphere Application Server runtime there are several distinct technologies employed to provide
failover and work load management.
1.2.1 WebSphere Servlet Requests
WebSphere Application Server utilizes an HTTP Server plug-in to dispatch servlet requests from the HTTP
server to one or more Web/Servlet Containers running inside an application server. In addition to providing for
distribution of requests, the plug-in also provides for failover of requests once a web container is determined to
be unavailable or not responding to requests. The specifics of how this determination occurs are discussed in
chapter 3, but the diagram below depicts request distribution at a high level. The key point to recognize at this
point is that requests can be dispatched across 1 to N application servers residing on 1 to N physical servers,
thus eliminating a SPOF. These replica processes provide process availability. Data availability is an
application domain issue.
Figure 1.5: WebSphere Servlet Request Mechanism (diagram: servlet requests arrive at the HTTP server,
whose plug-in distributes them over HTTP(S) to servlet containers in multiple application servers)
1.2.2 EJB Requests
In a similar vein, WebSphere Application Server also provides for distribution of requests for Enterprise Java
Beans. These requests can come from a standalone Java client, another EJB, or from a servlet (or JSP) as is
typically the case in web client architecture. Again the key point for purposes of this discussion is to recognize
that requests can be dispatched among 1 to N WebSphere Application Servers residing on multiple physical
machines, eliminating any SPOF. These replica processes provide process availability. Data availability is an
application domain issue. The specifics of the implementation of EJB failover are discussed further in chapter 4,
but the mechanism is depicted at a high level below.
Figure 1.6: WebSphere EJB Request Mechanism (diagram: EJB requests from a servlet container or from a
standalone Java client flow over IIOP to EJB containers in multiple application servers)
These two technologies can be used singularly or combined to provide either vertical clustering or horizontal
clustering (these are sometimes referred to as vertical scalability and horizontal scalability).
• Vertical clustering refers to the practice of defining multiple clones of an application server on the same
  physical machine. In some cases a single application server, which is implemented by a single JVM
  process, cannot always fully utilize the CPU power of a large machine and drive CPU load up to 100%.
  Vertical cloning provides a straightforward mechanism to create multiple JVM processes that together can
  fully utilize all the processing power available, as well as providing process-level failover.

• Horizontal clustering refers to the more traditional practice of defining clones of an application server on
  multiple physical machines, thereby allowing a single WebSphere application to span several machines
  while presenting a single system image. Horizontal cloning can provide both increased throughput and
  failover.
Figure 1.7: Vertical and Horizontal Scalability (diagram: vertical scalability places application server clones
1 and 2 on a single node; horizontal scalability places the clones on Node 1 and Node 2, with servlet or EJB
requests distributed across them)
1.2.3 Administrative Server High Availability
In V4.x, administrative servers can participate in workload management. The WLM mechanism for the
administrative servers behaves in a manner similar to that for EJB requests. This is provided by virtue of the
WebSphere runtime's EJB and Java architecture. The workload managed administrative servers provide
identical processes across the cluster that can be used to satisfy administration requests, eliminating any SPOF.
This provides process availability, as discussed in chapter 5. However, one must keep in mind that the
WebSphere infrastructure relies on data in order to function. Chapters 6 and 7 will address administration
database availability.
A standalone Java client has a unique set of failover concerns. Whereas servlets, JSPs, and EJBs run in an
application server, and have some knowledge of the other administrative servers in the WebSphere domain,
the standalone Java client relies on a bootstrap administrative server to provide this information. If the
bootstrap administrative server fails, the Java client must re-bootstrap to a different administrative server.
Additional information on this scenario is provided in Chapter 6.
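As a simple illustration of this re-bootstrap behavior, the sketch below shows a standalone Java client trying
a list of administrative server hosts in turn when creating its JNDI initial context. The host names and retry
policy are assumptions made for the example; the iiop:// provider URL, default bootstrap port 900, and
com.ibm.websphere.naming.WsnInitialContextFactory follow the WebSphere 4.0 documentation.

// Sketch of a standalone Java client re-bootstrapping its JNDI initial context
// against a list of administrative servers. Host names are illustrative.
import java.util.Properties;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

public class BootstrapFailoverClient {
    private static final String[] ADMIN_HOSTS = { "adminhost1", "adminhost2" };

    static Context createInitialContext() throws NamingException {
        NamingException lastFailure = null;
        for (String host : ADMIN_HOSTS) {
            Properties env = new Properties();
            env.put(Context.INITIAL_CONTEXT_FACTORY,
                    "com.ibm.websphere.naming.WsnInitialContextFactory");
            env.put(Context.PROVIDER_URL, "iiop://" + host + ":900");
            try {
                return new InitialContext(env);    // bootstrap against this administrative server
            } catch (NamingException e) {
                lastFailure = e;                   // bootstrap server is down; try the next one
            }
        }
        throw lastFailure;                         // no administrative server was reachable
    }
}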
1.2.4 Discussions and Problem Spots
By providing both vertical and horizontal scalability the WebSphere Application Server runtime architecture
eliminates a given application server process as a single point of failure. Horizontal scalability eliminates a single
node as a single point of failure. Vertical scalability can be added to improve process availability, or in some
cases may provide better utilization of resources on a single node.
In fact the only single point of failure in the WebSphere runtime is the database server where the WebSphere
administrative repository resides. It is on the database server that a hardware-based high availability solution
such as HACMP or MC/ServiceGuard should be configured. One should have already invested in making the
application data highly available, and this investment should be leveraged for the administration and session
databases as well.
With WLM of administration servers, WebSphere bypasses many of the problems associated with failover
that could result from the failure of an administration server process on a given node. WebSphere does not
currently support running its application servers with takeover on a hardware-based HA platform, because there
is little to be gained from the additional investment in the infrastructure. One potential problem area arises when
WebSphere is serving as the coordinator of a two-phase commit transaction. This is discussed in more detail
in Chapter 5.
1.3 Failover Topologies with WebSphere
While the WebSphere Application Server provides a robust failover and load distribution capability within the
runtime, there are other components that need to be made highly available in order to provide a highly available
web site. As noted previously when discussing failover with WebSphere Application Server, it is important to
consider the entire production environment, not just the WebSphere Application Server itself. This includes
the HTTP server, the database servers, and the WebSphere Administration Server. Now that we have
discussed the fault tolerance mechanisms within the WebSphere runtime proper, let’s return to a discussion of
failover topologies for the WebSphere environment.
1.3.1 HTTP Requests
The first component to consider is the HTTP Server. Some mechanism of providing for load
distribution/failover of incoming HTTP requests is required. Without such a mechanism the HTTP server itself
becomes a SPOF, and scalability is limited by the size of the hardware that can be used to host the HTTP server.
For this, an IP Sprayer such as WebSphere Edge Server (WES) Network Dispatcher needs to be employed.
Like WebSphere Application Server, the WES architecture provides for cluster support, eliminating the need
for a third-party hardware clustering product to provide cluster capability. Other HTTP load distribution
products may or may not provide inherent clustering support. Those products that do not provide clustering
support will require the use of a hardware based clustering product to eliminate this component as a SPOF.
The subjects of HTTP server failover and Network Dispatcher (eND) failover are discussed further in
chapter 2 and Appendix A. The diagram below depicts where eND fits into a highly available topology.
Figure 1.8: HTTP Request flows (diagram: HTTP requests arrive at the primary Network Dispatcher, which
sprays them across HTTP servers running the WebSphere plug-in; a failover Network Dispatcher stands by)
1.3.2 Database Server
The next layer of the topology that needs to be made highly available is the database server.
Without highly available application data, your web applications cannot provide dynamic and personalized data
to your customers. Order processing and transactions are not possible; in short, a web site could serve only
static content for the most part. It is at this layer in the topology that either the hardware or software clustering
technology as discussed above needs to be employed. Database failover and the specifics for handling
database failover with both the WebSphere runtime for the administration database, and within your
applications for application recovery will be discussed in more detail in chapter 6.
1.3.3 LDAP Server
A component that is sometimes overlooked when considering HA is the LDAP server. Much like the database
server, the LDAP server represents a SPOF unless some manner of replication or clustering is provided.
Some options for making the LDAP server highly available will be discussed in chapter 8.
1.3.4 Firewall
Firewalls are another major component of a WebSphere environment. Much like the network itself, failure of a
firewall can result in catastrophic consequences. HA options for firewalls will be discussed in chapter 8.
1.4 Suggested Topologies
The diagram below depicts the various layers in a WebSphere topology and the mechanisms employed to
provide high availability.
Figure 1.9: High Availability Mechanisms by Layer (diagram: primary and standby IP sprayers made HA
with WebSphere Edge Server or hardware clustering; HTTP servers made HA with the WebSphere HTTP
server plug-in; WebSphere application servers and administrative servers made HA with EJB WLM; the
databases and LDAP server made HA with hardware or software clustering; firewalls at the Internet
boundary)
This depicts a minimal topology for HA, with two physical servers deployed in each layer of the topology for a
total of 12 servers from the firewall to the database and LDAP servers. While the number of servers could be
reduced further by compressing the various layers (e.g., Network Dispatcher could be co-located on each
of the HTTP server machines), administration issues as well as security considerations (firewalls)
probably preclude this in practice.
An even more robust implementation is depicted below. In this case two WebSphere Domains are employed
to create a “gold standard” in high availability.
Figure 1.10: “Gold Standard” WebSphere Domain (diagram: two WebSphere domains, each with its own
administrative database and multiple nodes of application servers and administrative servers, sit behind
Network Dispatchers and HTTP servers and share the application and session database)
While there are no “hard” limits on the number of nodes that can be clustered in a WebSphere domain, one
may want to consider creating multiple WebSphere domains for a variety of reasons:
• Two (or more) domains can be employed to provide not only hardware failure isolation, but software
  failure isolation as well. This can come into play in a variety of situations:
    • Planned Maintenance
        § When deploying a new version of WebSphere. (Note that running WebSphere V3.x and V4.x nodes
          in the same domain is not supported.)
        § When applying an e-fix or patch.
        § When rolling out a new application or revision of an existing application.
• In cases where an unforeseen problem occurs with the new software, multiple domains prevent a
  catastrophic outage to an entire site. A rollback to the previous software version can also be accomplished
  more quickly. Of course, multiple domains imply the software has to be deployed more than once, which
  would not be the case with a single domain.
• Multiple smaller domains may provide better performance than a single large domain, since there will be
  less interprocess communication in a smaller domain.
Of course multiple domains will require more effort for day-to-day operations, since administration must be
performed on each domain. This can be mitigated through the use of scripts employing wscp and XMLConfig.
Multiple domains also mean multiple administration repositories (databases), which means multiple backups
here as well.
Chapter 2 - HTTP Server Failover
In a production WebSphere topology, static HTML content, servlets, and JSPs are invoked using the HTTP
protocol through an external HTTP server, such as IBM HTTP Server or Microsoft Internet Information
Server. This makes continuous and reliable operation of the HTTP server necessary in the design of a highly
available WebSphere solution. The configuration in figure 2.1 has only one web server and thus does not offer
any failover capability to the web server component.
HTTP
client
Internet
HTTP
Server
WebSphere Domain
(1 or more nodes)
Figure 2.1: Single web server configuration - no failover capability
In the event of this web server’s failure, all requests from the HTTP client (web browser) would fail to be
routed to the application server(s) running inside the WebSphere domain. High availability can be achieved by
using a cluster of HTTP servers and an IP sprayer to route requests to the multiple servers.
An IP sprayer transparently redirects incoming requests from HTTP clients to a set of HTTP servers. Although
the clients behave as if they are communicating directly with a given HTTP server, the IP sprayer is actually
intercepting all requests and distributing them among all the available HTTP servers in the cluster. Most IP
sprayers provide several different options for routing requests among the HTTP servers, from a basic
round-robin algorithm to complex utilization algorithms.
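As a minimal illustration of the simplest of these options, the sketch below shows a round-robin choice among
a fixed set of HTTP servers. It is conceptual only and not how Network Dispatcher or any other IP sprayer is
implemented; real products operate at the packet-forwarding level and offer weighted and utilization-based
algorithms as well.

// Conceptual round-robin selection among HTTP servers, the simplest routing
// option an IP sprayer might offer. Illustrative only; the addresses reuse
// the HTTP server IPs shown in figure 2.2.
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinSelector {
    private final String[] servers;
    private final AtomicInteger next = new AtomicInteger(0);

    public RoundRobinSelector(String... servers) {
        this.servers = servers.clone();
    }

    public String nextServer() {
        // Cycle through the configured servers in order, wrapping around.
        int index = Math.floorMod(next.getAndIncrement(), servers.length);
        return servers[index];
    }

    public static void main(String[] args) {
        RoundRobinSelector selector = new RoundRobinSelector("10.0.0.5", "10.0.0.6");
        for (int i = 0; i < 4; i++) {
            System.out.println("Request " + i + " -> " + selector.nextServer());
        }
    }
}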
Figure 2.2 shows a basic configuration that implements this solution. Each machine in the topology is
configured with at least one physical IP address, and a loopback adapter configured with a shared virtual IP
address, sometimes called a cluster address. HTTP clients make HTTP requests on this virtual IP address.
These requests are routed to the primary sprayer which in turn sprays them to the cluster of web servers. The
cluster of web servers consists of identical web servers running on different physical machines. In the event of a
failure of one of the HTTP servers, the other HTTP server(s) will successfully accept all the future HTTP
requests from the IP sprayer.
It is also necessary to provide some sort of failover mechanism for the IP sprayer, to prevent this machine from
becoming a single point of failure. One option is to provide a backup IP sprayer, which maintains a heartbeat
with the primary sprayer. If the primary sprayer fails, the backup sprayer will take over the virtual IP address and
process requests from HTTP clients.
Figure 2.2: Highly Available web server configuration (diagram: HTTP clients address the virtual IP
10.0.0.2; the primary IP sprayer at 10.0.0.4 distributes requests to HTTP servers at 10.0.0.5 and 10.0.0.6 in
front of the WebSphere domain, while a backup sprayer at 10.0.0.3 maintains a heartbeat with the primary)
Some products also provide mutual high availability for an IP sprayer. In this configuration, both IP sprayers
route requests to the HTTP server machines. If one of the sprayers fails, the other will take over and route all
requests.
There are several IP sprayers available, including IBM’s Network Dispatcher, a part of the WebSphere Edge
Server product. For more information on working with Network Dispatcher, see Appendix A and
“WebSphere Edge Server: Working with Web Traffic Express and Network Dispatcher” (SG246172)
available from http://www.redbooks.ibm.com.
DNS round-robin may also be used to allow HTTP requests to be served by multiple HTTP servers.
However, this solution can introduce failover problems because DNS resolution may be cached by
intermediate DNS servers. In the case of failure, the cache may not be flushed quickly enough to allow
resolution to a different and functional server. Furthermore, some DNS round-robin implementations do not
provide failure detection for the nodes being resolved. Without this capability, a subset of DNS clients may
continue to be routed to the failed node.
Chapter 3 - Web Container Failover
Once an HTTP request has reached the HTTP server, a routing decision must be made. Some
requests for static content may be handled by the HTTP server. Requests for dynamic content or some static
content will be passed to a web container running in a WebSphere Application Server. Whether the request
should be handled or passed to WebSphere is decided by the WebSphere HTTP server plug-in, which runs
in-process with the HTTP server. For these WebSphere requests, high availability for the web container
becomes an important piece of the failover solution.
Figure 3.1: Single Web Container configuration, no failover capability (diagram: incoming requests pass
from the HTTP server and its plug-in to a single web container, backed by the administrative server, the
WebSphere administrative database, and the application and session database)
The configuration in Figure 3.1 has only one web container and provides no failover support. In this scenario, if
the web container fails, all HTTP requests which needed to be handled by WebSphere would fail.
High availability of the web container is provided by a combination of two mechanisms, the server group
support built into the WebSphere administration model, and the routing support built into the WebSphere
plug-in to the HTTP server.
3.1 WebSphere Server Groups
WebSphere Application Server 4.0 provides support for creating an application server template, called a
Server Group. Server groups can contain a web container, an EJB container (discussed in more detail in
Chapter 4), or both. From a server group, a WebSphere administrator can create any number of application
server instances, or clones, of this server group. These clones can all reside on a single node, or can be
distributed across multiple nodes in the WebSphere Domain. Clones can be administered as a single unit by
manipulating the server group object. WebSphere clones can share application workload and provide failover
support. If one of the clones fails, work can continue to be handled by the other clones in the server group, if
they are still available. See section 7.2.4 of the WebSphere InfoCenter and Chapter 17 of the “WebSphere
4.0 Advanced Edition Handbook” (SG246176) for more information on working with server groups.
3.2 The HTTP Server Plug-in
As mentioned in the previous chapter, a WebSphere environment may include several HTTP server instances.
Each HTTP server is configured to run the WebSphere HTTP plug-in in its process. There is a plug-in of
similar functionality designed for each HTTP server supported by WebSphere. Each request which comes into
the web server is passed through this plug-in, which uses its configuration information to determine if the
request should be routed to WebSphere, and if so, which web container the request should be routed to.
These web containers may be running on the same machine as the HTTP server, a different machine, or a
combination of the two.
Figure 3.2: High Availability Web Container Configuration
The plug-in uses an XML configuration file (<WAS_HOME>/config/plugin-cfg.xml) to determine information
about the WebSphere domain it is serving. This configuration file is initially generated by the WebSphere
Administrative Server. If the web server is on a machine remote from the WebSphere Administrative Server,
as in the above diagram, the plugin-cfg.xml file will need to be moved to the web server machine. See the
WebSphere 4.0 InfoCenter for more details on WebSphere plug-in topologies.
When the HTTP server is started, the plug-in reads the information from the plugin-cfg.xml configuration file
into memory. The plug-in then periodically checks to see if the file has been modified, and reloads the
information if necessary. How often the plug-in checks this file is based on the RefreshInterval property defined
in the file. If this attribute is not present, the default value is 60 seconds.
<Config RefreshInterval="60">
In a production environment, the RefreshInterval should be set to a relatively high number, as the overhead of
the plug-in checking for a new configuration file frequently can adversely affect performance.
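For example, a production plugin-cfg.xml might raise the interval; the value shown here is purely illustrative:
<Config RefreshInterval="300">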
The plug-in also uses this configuration file to determine how to route HTTP requests to WebSphere web
containers. To more closely examine how the plug-in makes this determination, let’s look at an example.
Assume the plug-in receives a request for the URL http://www.mycompany.com/myapplication/myservlet.
The plug-in parses the request into two pieces, a virtual host (www.mycompany.com:80) and a URI
(/myapplication/myservlet). The plug-in then checks the plugin-cfg.xml file searching for a match for these two
items:
<UriGroup Name="myapplication web module">
<Uri Name="/myapplication"/>
</UriGroup>
<VirtualHostGroup Name="default_host">
<VirtualHost Name="*:80"/>
<VirtualHost Name="*:9080"/>
</VirtualHostGroup>
Next the plug-in searches for a route entry that contains both the <UriGroup> and the <VirtualHostGroup>
entry. This route is used to determine the <ServerGroup> which can handle the request.
<Route ServerGroup="MyApplication Server Group" UriGroup="myapplication web module"
VirtualHostGroup="default_host"/>
A <ServerGroup> represents either a standalone server or a group of WebSphere clones which are available
to service the request. Once the <ServerGroup> has been determined, the plug-in must choose a <Server>
within the group to route the request to.
<ServerGroup Name="MyApplication Server Group">
<Server CloneID="t31ojsis" Name="Server1">
<Transport Hostname="myhost1.domain.com" Port="9089" Protocol="http"/>
</Server>
<Server CloneID="t31pl23p" Name="Server2">
<Transport Hostname="myhost2.domain.com" Port="9084" Protocol="http"/>
</Server>
</ServerGroup>
The plug-in first checks to see if the client request has an HTTP session associated with it. If so, the CloneID is
parsed from the end of the session ID and compared with the CloneIDs for the <Server>s in the
<ServerGroup> until a match is found. The request is then routed to that <Server>.
If no session ID is associated with the request, the plug-in sends the request to the next <Server> in its routing
algorithm. The routing algorithm is either Round Robin or Random, based on the LoadBalance attribute of the
<ServerGroup>. If this attribute is not present, the default value is Round Robin.
<ServerGroup Name="MyApplication Server Group" LoadBalance="Random">
Once the <Server> has been determined, the <Transport> must be chosen for communications. Transports
define the characteristics of the connections between the web server and the application server, across which
requests for applications are routed. If the <Server> only has one <Transport> configured, this decision is
easy. It is possible, however, for the <Server> to have two transports configured, one for HTTP
communication and one for HTTPS communication.
<Server CloneID="t31ojsis" Name="Server1">
<Transport Hostname="myhost1.domain.com" Port="9089" Protocol="http"/>
<Transport Hostname="myhost1.domain.com" Port="9079" Protocol="https">
<Property name="keyring" value="C:\WebSphere\AppServer\etc\plugin-key.kdb"/>
<Property name="stashfile" value="C:\WebSphere\AppServer\etc\plugin-key.sth"/>
</Transport>
</Server>
In this case, the <Transport> communication between the plug-in and the web container is matched to the
communication protocol used between the browser and the web server. The URL
http://www.mycompany.com/myapplication/myservlet would be sent from the plug-in to the web container
using the HTTP transport, while https://www.mycompany.com/myapplication/myservlet would be sent using the
HTTPS transport.
3.3 Web Container Failover
When a web container fails, it is the responsibility of the HTTP server plug-in to detect this failure and mark
the web container unavailable. Web container failures are detected based on the TCP response values, or lack
of response, to a plug-in request. There are four types of failover scenarios for web containers:
• Expected outage of the application server - The application server containing the web application is
stopped from one of the administrative interfaces (administrative console, XMLConfig, WSCP). There is
currently an issue with application server shutdown under heavy load. During the shutdown process, a
small number of client requests may fail with a 404 - File Not Found error. This problem is being
investigated and a fix will be made available.
• Unexpected outage of the application server - The application server crashes for an unknown reason.
This can be simulated by killing the process from the operating system.
• Expected outage of the machine - WebSphere is stopped and the machine is shut down.
• Unexpected outage of the machine - The machine is removed from the network due to shutdown,
network failure, hardware failure, etc.
In the first two cases, the physical machine where the web container is supposed to be running will still be
available, although the web container port will not be available. When the plug-in attempts to connect to the
web container port to process a request for a web resource, the machine will refuse the connection, causing
the plug-in to mark the <Server> unavailable.
In the second two cases, however, the physical machine is no longer available to provide any kind of response.
In these cases, the plug-in must wait for the local operating system to timeout the request before marking the
<Server> unavailable. While the plug-in is waiting for this connection to timeout, requests routed to the failed
<Server> appear to hang.
The default value for the TCP timeout varies based on the operating system. While these values can be
modified at the operating system level, adjustments should be made with great care. Modifications may result
in unintended consequences in both WebSphere and other network dependent applications running on the
machine. See Appendix B for details on viewing and setting the TCP timeout value for each operating system.
If a request to a <Server> in a <ServerGroup> fails, and there are other <Server>s in the group, the plug-in
will transparently reroute the failed request to the next <Server> in the routing algorithm. The unresponsive
<Server> is marked unavailable and all new requests will be routed to the other <Server>s in the
<ServerGroup>.
The amount of time the <Server> remains unavailable after a failure is configured by the RetryInterval attribute
on the <ServerGroup> element. If this attribute is not present, the default value is 60 seconds.
<ServerGroup Name="MyApplication Server Group" RetryInterval=1800>
When this RetryInterval expires, the plug-in will add this <Server> into the routing algorithm and attempt to
send a request to it. If the request fails or times out, the <Server> is again marked unavailable for the length of
a RetryInterval.
The proper setting for a RetryInterval will depend on the environment, particularly the value of the operating
system TCP timeout value and how many <Server>s are available in the <ServerGroup>. Setting the
RetryInterval to a small value will allow a <Server> which becomes available to quickly begin serving requests.
However, too small of a value can cause serious performance degradation, or even cause your plug-in to
appear to stop serving requests, particularly in a machine outage situation.
To explain how this can happen, let’s look at an example configuration with two machines, which we will call
A and B. Each of these machines runs two cloned <Server>s. The HTTP server and plug-in are
running on an AIX box with a TCP timeout of 75 seconds, the RetryInterval is set to 60 seconds, and the
routing algorithm is Round Robin. If machine A fails, either expectedly or unexpectedly, the following process
occurs when a request comes in to the plug-in:
1. The plug-in accepts the request from the HTTP server and determines the <ServerGroup>.
2. The plug-in determines that the request should be routed to clone 1 on machine A.
3. The plug-in attempts to connect to clone 1 on machine A. Because the physical machine is down, the
plug-in waits 75 seconds for the operating system TCP timeout interval before determining that clone 1 is
bad.
4. The plug-in attempts to route the same request to the next clone in its routing algorithm, clone 2 on
machine A. Because machine A is still down, the plug-in must again wait 75 seconds for the operating
system TCP timeout interval before determining that clone 2 is bad.
5. The plug-in attempts to route the same request to the next clone in its routing algorithm, clone 1 on
machine B. This clone successfully returns a response to the client, over 150 seconds after the request was
first submitted.
6. While the plug-in was waiting for the response from clone 2 on machine A, the 60 second RetryInterval for
clone 1 on machine A expired, and the clone is added back into the routing algorithm. A new request will
soon be routed to this clone, which is still unavailable, and we will begin this lengthy waiting process again.
To avoid this problem, we recommend a more conservative RetryInterval, related to the number of clones in
your configuration. A good starting point is 10 seconds + (# of clones * TCP_Timeout). This ensures that the
plug-in does not get stuck in a situation of constantly trying to route requests to the failed clones. In the
scenario above, this setting would cause the two clones on machine B to exclusively service requests for 235
seconds before the clones on machine A were retried, resulting in another 150-second wait.
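For reference, the guideline can be expressed as a trivial calculation. This is only a starting point, and which clone count to plug in (all clones in the server group, or only those that can become unavailable together) is a judgment call for your own topology:

// Starting-point helper for the RetryInterval guideline above; inputs are supplied by you.
public class RetryIntervalEstimate {
    static int recommendedRetryIntervalSeconds(int numberOfClones, int tcpTimeoutSeconds) {
        return 10 + (numberOfClones * tcpTimeoutSeconds);
    }
}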
3.4 Session affinity and session persistence
HTTP session objects can be used within a web application to maintain information about a client across
multiple HTTP requests. For example, on an online shopping website, the web application needs to maintain
information about what each client has placed in his or her shopping cart. The session information is stored on
the server, and a unique identifier for the session is sent back to the client as a cookie or through the URL
rewriting mechanisms.
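For reference, a servlet might use the HTTP session along the following lines; the servlet class and attribute names are hypothetical, and the sketch is only meant to show where the server-side state lives:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

// Hypothetical servlet illustrating HTTP session usage. WebSphere returns the session ID
// to the browser as a cookie or via URL rewriting; the cart contents stay on the server.
public class ShoppingCartServlet extends HttpServlet {
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        HttpSession session = req.getSession(true); // create the session if it does not exist
        String cart = (String) session.getAttribute("cartContents");
        if (cart == null) {
            cart = "";
        }
        String item = req.getParameter("item");
        if (item != null) {
            cart = cart + item + ";";
            session.setAttribute("cartContents", cart); // state held in the web container
        }
        resp.setContentType("text/plain");
        resp.getWriter().println("Cart: " + cart);
    }
}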
The behavior of applications that utilize HTTP sessions in a failover situation will depend on the HTTP session
configuration within WebSphere. Each application server within WebSphere 4.0 has a Session Manager
Service which is accessible from the Services tab in the WebSphere Administrative Console. WebSphere 4.0
has three basic session configuration options.
No sessions - If the application does not need to maintain information about a client across multiple requests,
HTTP sessions are not required. Disabling this support within the WebSphere plug-in, while not necessary, will
improve performance. To disable session checking, remove the CloneID values from the <Server> entries.
<ServerGroup Name="MyApplication Server Group" RetryInterval=240>
<Server CloneID="t31ojsis" Name="Server1">
<Transport Hostname="myhost1.domain.com" Port="9089"
Protocol="http"/>
</Server>
<Server CloneID="t31pl23p" Name="Server2">
<Transport Hostname="myhost2.domain.com" Port="9084"
Protocol="http"/>
</Server>
</ServerGroup>
Session Affinity (without Session Persistence) - HTTP session affinity is enabled by default. When this
support is enabled, HTTP sessions are held in the memory of the application server containing the web
application. The plug-in will route multiple requests for the same HTTP session to the same <Server> by
examining the CloneID information which is stored with the session key. If this <Server> is unavailable, the
request will be routed to another <Server> in the <ServerGroup>, but the information stored in the session is
lost.
Session Persistence and Session Affinity - When Session Affinity and Session Persistence are both
enabled, HTTP sessions are held in memory of the application server containing the web application, and are
also periodically persisted to a database. The plug-in will route multiple requests for the same HTTP session to
the same <Server>. This <Server> can then retrieve the information from its in-memory session cache. If this
<Server> is unavailable, the request is routed to another <Server>, which reads the session information from a
database. How much session information is lost depends on the frequency of the persistence, which is
configurable on the application server. Persisting information to the database more frequently provides more
transparent failover, but may adversely affect your application performance. For more details on configuring
session persistence, see Chapter 15 of the “WebSphere V4.0 Advanced Edition Handbook”
(SG24-6176-00) available from http://www.redbooks.ibm.com.
By persisting HTTP sessions to a database, we introduce another point of failure, the database itself. High
availability for databases will be discussed in Chapters 6 and 7.
Chapter 4 - EJB Container Failover
Many J2EE applications rely on Enterprise JavaBeans (EJBs) to provide business logic. EJB clients can be
servlets, JSPs, stand-alone Java applications, or even other EJBs. When the EJB client is running in a web
container in the same Java Virtual Machine (JVM) as the EJB, as is often the case with servlets and JSPs, the
workload management of the web container resources will provide failover for both the HTTP resources and
the EJB container. However, if the EJB client is running in a different JVM, failover for the EJB container
becomes a critical piece of the overall failover of the environment.
Figure 4.1: Single EJB Container Configuration, no failover capability
The configuration in Figure 4.1 has only one EJB container and provides no failover support. In this scenario, if
the EJB container fails, all EJB requests would also fail.
High availability of the EJB container is achieved using a combination of the WebSphere server group support
and the Workload Management (WLM) plug-in to the WebSphere Object Request Broker (ORB). As
discussed in the previous chapter, WebSphere server groups allow multiple instances of a WebSphere
Application Server to be created from a template. These multiple application servers, or clones, have a single
administration point, and the ability to share workload.
4.1 The WLM plug-in
The mechanisms for routing workload managed EJB requests to multiple clones are handled on the client side
of the application. In WebSphere 4.0, this functionality is supplied by a workload management plug-in to the
client ORB. In contrast to previous WebSphere releases, there are no longer any changes required to the EJB
jar file to provide workload management, greatly simplifying the process of deploying EJBs in a workload
managed environment.
Figure 4.2: High Availability EJB Container Configuration
To better understand the EJB workload management process, let’s consider a typical workload managed EJB
call with the help of Figure 4.2.
1. An EJB client (servlet, Java client, another EJB) performs a JNDI lookup for the EJB.
java.lang.Object o = initialContext.lookup("myEJB");
The Interoperable Object Reference (IOR) of the EJB is returned to the client by the JNDI service
running in the WebSphere Administrative Server. This IOR contains a WLM flag which indicates whether the
EJB is participating in workload management. If the WLM flag is set, the IOR will also contain a server
group name and routing policy information.
2. A method is invoked on the EJB. If this is the first time the WLM plug-in to the ORB has seen this server
group, it makes a request to the Administrative Server to obtain the list of available clones. This list is
stamped with an identifier called an epoch number.
3. The client ORB now has information about all the clones in the ServerGroup and it also has information
about the selection criteria for the EJBs. As the EJB client continues to make calls to the EJB object, the
WLM plug-in to the ORB routes these requests to the appropriate clones using the specified routing
policy. For more detail on the routing policies available in WebSphere 4.0, see chapter 17 of the “IBM
WebSphere V4.0 Advanced Edition Handbook.”
Figure 4.3: Workload Managed EJB Invocations
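A client-side view of this sequence is sketched below. MyEJB, MyEJBHome, the doWork method, and the JNDI name are placeholders, not part of the product; the point is simply that repeated calls on the workload managed reference may be spread across the clones:

import java.rmi.RemoteException;
import javax.ejb.CreateException;
import javax.ejb.EJBHome;
import javax.ejb.EJBObject;
import javax.naming.InitialContext;
import javax.rmi.PortableRemoteObject;

// Hypothetical remote and home interfaces standing in for your own EJB.
interface MyEJB extends EJBObject {
    void doWork() throws RemoteException;
}

interface MyEJBHome extends EJBHome {
    MyEJB create() throws CreateException, RemoteException;
}

public class WlmClientSketch {
    public static void main(String[] args) throws Exception {
        InitialContext ic = new InitialContext();
        Object o = ic.lookup("myEJB");                    // step 1: the returned IOR carries the WLM flag
        MyEJBHome home = (MyEJBHome) PortableRemoteObject.narrow(o, MyEJBHome.class);
        MyEJB bean = home.create();                       // step 2: first use triggers server group retrieval
        for (int i = 0; i < 5; i++) {
            bean.doWork();                                // step 3: calls are routed across the clones
        }
    }
}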
This chapter discusses failover of the EJB container. However, if the admin server fails during one of the above
steps, other problems can occur. See Chapter 5 for information on handling admin server failover.
4.2 EJB Container failover
EJB container failover is handled in combination with the WebSphere Administrative Server and the WLM
plug-in to the ORB. The administrative server is the parent process of all of the clones running on a machine. If
an existing clone becomes unavailable or a new clone becomes available, the administrative server updates its
server group information to reflect this, and assigns a new epoch number. This information is pushed to all of
the available clones in the configuration periodically. How often this push occurs is configured by the parameter
com.ibm.ejs.wlm.RefreshInterval set on the com.ibm.ejs.sm.util.process.Nanny.adminServerJvmArgs entry
in the <WAS_HOME>\bin\admin.config file. The default value is 300 seconds.
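As an illustration only (your admin.config entry may already carry other JVM arguments, so verify the exact format on your own installation), lowering the push interval to 120 seconds might look like:
com.ibm.ejs.sm.util.process.Nanny.adminServerJvmArgs=-Dcom.ibm.ejs.wlm.RefreshInterval=120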
The administrative server does not push this information to the EJB clients, however. On the next client
request, one of the following will happen:
1. The next clone in the WLM routing algorithm is available.
i. The clone services the request.
ii. If the clone detects that the epoch number of the client does not match its current epoch number, the
new server group information is passed to the client along with the response.
2. The next clone in the WLM routing algorithm is no longer available.
i. If the com.ibm.ejs.wlm.MaxCommFailures threshold has been reached for the clone, it is marked
unusable. By default, the MaxCommFailures threshold is 0, meaning that after the first failure, the clone
is marked unusable. This property can be set by specifying
-Dcom.ibm.ejs.wlm.MaxCommFailures=<number> on the command line arguments for the client
process.
ii. The WLM plug-in will not route requests to this clone until new server group information is received,
or until the expiration of the com.ibm.ejs.wlm.UnusableInterval. This property is set in seconds. The
default value is 900 seconds. This property can be set by specifying
-Dcom.ibm.ejs.wlm.UnusableInterval=<seconds> on the command line arguments for the client
process.
iii. The failed request is transparently routed to the next clone in the routing algorithm.
If the WLM plug-in cycles through all of its known servers and none of the clones are available, it makes a
request to the administrative server for the new server group information.
As with the HTTP server plug-in, the WLM plug-in needs to be able to detect both process and machine
failures. When an EJB container process fails, the RMI/IIOP port it was listening on will no longer be
available. When an EJB client attempts to communicate with the EJB container, the connection will be refused.
Machine failures are a bit more difficult to deal with. If the failure occurs before a connection has been
established, the operating system TCP timeout value (as discussed in Appendix B) is used. If the failure occurs
after the connection is established, the client ORB must timeout communications. The amount of time the ORB
will wait before timing out is configured by the property com.ibm.CORBA.requestTimeout. The default
value for this property is 180 seconds. This default value should only be modified if your application is
experiencing timeout problems, and then great care must be taken to tune it properly. If the value is set too
high, the failover process will be very slow. Set it too low, and valid requests will be timed out before the EJB
container has a chance to respond. The factors that affect a suitable timeout value are the amount of time to
process an EJB request, and the network latency between the client and the EJB server. The time to process a
request depends on the application, and the load on the EJB server. The network latency depends on the
location of the client. Clients running within the same local area network as the application servers may use a
smaller timeout value to provide faster failover. Clients situated farther away on the intranet or internet need to
use a longer timeout to tolerate the inherent latency of long distance networks.
If the EJB client is running inside of a WebSphere Application Server, this property can be modified by editing
the request timeout field on the Object Request Broker property sheet. If the client is a Java client, the
property can be specified as a runtime option on the Java command line, for example :
java -Dcom.ibm.CORBA.requestTimeout=<seconds> MyClient
4.3 WLM and EJB Types
When a client accesses a workload managed EJB, it may be routed to one of a number of instances of the EJB
on a number of servers. Not all EJB references can be utilized this way. The table below shows the types of
EJBs that can be workload managed.
EJB Type                                              Able to be Workload Managed
Entity Bean, Commit time caching Option A
    Home                                              No
    CMP                                               No
    BMP                                               No
Entity Bean, Commit time caching Options B and C
    Home                                              Yes
    CMP                                               Yes
    BMP                                               Yes
Session Bean
    Home                                              Yes
    Stateless                                         Yes
    Stateful                                          No
The only types of EJBs that cannot be workload managed are instances of a given stateful session bean and
entity beans with commit time caching option A. A stateful session bean is used to capture state information
that must be shared across multiple client requests that are part of a logical sequence of operations. To ensure
consistency in the logic flow of the application, a client must always return to the same instance of the stateful
session bean, rather than having multiple requests workload managed across a group of available stateful
session beans.
Because of this restriction, stateful session bean instances are not considered failover-tolerant. If the process
running the stateful session bean fails, the state information maintained in that bean is unrecoverable. It is the
responsibility of the application to reconstruct a stateful EJB with the required state. If the application cannot do
this, an alternative implementation, such as storing state in an entity EJB, should be considered.
4.4 WLM and EJB Caching
In the case of Entity Beans, the EJB caching option in use plays a role in workload management. WebSphere
4.0 supports three caching options:
1. Option A caching: The EJB container caches an instance of the EJB in a ready state between
transactions. The data in the EJB is also cached, not requiring the data to be reloaded from the data store
at the start of the next transaction. The entity bean must have exclusive access to the underlying database,
which means that workload management will not behave properly in option A caching mode since multiple
clones of the same entity could potentially modify the entity data in inconsistent ways.
2. Option B caching: The EJB container caches an instance of the EJB in a ready state between
transactions, but invalidates the state information for the EJB. At the start of the next transaction, the data
will be reloaded from the database.
3. Option C caching (default): No ready instances are cached by the container, they are returned to the
pool of available instances after the transaction has completed.
Entity beans can be workload managed if they are loaded from the database at the start of each transaction.
By providing either option B caching or option C caching (default), the entity beans can be made to participate
in WLM. These two caching options ensure that the entity bean is always reloaded from the database at the
start of each transaction.
In WebSphere 4.0, the caching options are configured using the Application Assembly Tool (AAT).
The caching options (A, B or C) are determined by the combination of options selected in the drop down
menus Activate at and Load at. The table below shows the values that represent the three caching options.
Option      Activate at      Load at
A           Once             Activation
B           Once             Transaction
C           Transaction      Transaction
Invalid     Transaction      Activation
For more information on EJB caching options within WebSphere, see the WebSphere InfoCenter or the
“WebSphere V4.0 Advanced Edition Handbook.”
Chapter 5 - Administrative Server Failover
5.1 Introduction
The WebSphere Administrative Server (admin server) runs in a JVM on each node in a WebSphere
Administrative Domain. The admin server is responsible for providing run-time support for JNDI service,
security, transaction logging, Location Service Daemon, and EJB workload management. The admin server
also provides system management support initiated from administrative interfaces, including the Administrative
Console, XMLConfig, and WSCP. Communications with the admin server take place using RMI/IIOP.
In this chapter, we describe how the failure of the admin server affects the different types of clients, and how to
enable high availability for these clients.
Figure 5.1: Administrative Server Failover
The first line of defense against admin server failure is the nanny process. If the admin server crashes, the
majority of the time it is restarted automatically by the nanny process. Clients of the admin server will fail only for
the duration required for failure detection and restart.
In the very rare case where the admin server is unable to restart, or in the case where the node hosting the
admin server crashes, WebSphere 4.0 provides the capability to workload manage admin servers, allowing the
clients to route requests around a failed admin server to a different admin server on a different node. This
requires a configuration consisting of at least two nodes, with one admin server on each node. The only case
where admin server WLM is not supported is transaction logging for two phase commit transaction
coordination. These logs are required to be on the same node as the transaction coordinator.
One does not explicitly create a server group for the WebSphere Administrative Server as is done with
Application Servers. The process of installing WebSphere on multiple machines, all pointing to a single
configuration database, creates a WebSphere cluster. The WLM capability is provided by setting the following
property in the admin.config file:
com.ibm.ejs.sm.adminServer.wlm=true
This property is set to true by default. Changing the value to false disables the workload management support.
For correct behavior, this property must be set to the same value on all admin servers in the WebSphere
domain.
Admin server workload management uses a special workload management policy different from those used by
regular EJBs. Once a client establishes a connection to an admin server, it continues to communicate with that
same admin server, unless that server becomes unavailable for some reason. In that case, the client will be
routed to one of the other admin servers in the domain, and will continue making calls to that server unless it
becomes unavailable. If the client to the workload managed admin servers is running on the same machine as
one of the servers, by default the client will always communicate with this local server first.
5.2 Runtime Support in the Administrative Server
One of the primary functions of the admin server is to provide runtime support for applications running inside
the WebSphere domain. The admin server provides support for JNDI service, Location Service Daemon,
security, transaction logging, and EJB workload management. We will first describe how the different types of
clients make use of the services in the admin server. We will then describe how to enable high availability for
these services.
5.2.1 Types of clients that use the admin server
5.2.1.1 Application server
The application server interacts with the admin server in many different ways. During server initialization, it
obtains configuration information from the admin server about the modules to host on the application server. It
contacts the bootstrap server within the admin server to obtain the InitialContext. This is used by the
application server to bind homes into JNDI namespace. If the admin server fails during server initialization, the
server is unable to start.
Applications within the application server, such as EJBs or servlets, may also contact the bootstrap server and
the name server when making use of the JNDI service. If the application object has not yet obtained an
InitialContext object when the failure occurs, JNDI lookups will be unsuccessful. If the InitialContext has been
obtained, the application will be workload managed to another admin server, if one is available.
When communicating with a workload managed EJB home or EJB instance, the WLM runtime may call the
admin server to obtain the most recent information about the servers in the server group. This occurs when first
making use of an object reference or, as a last resort, when attempts to reach all of the servers in the server group have failed. Failure
of the admin server can prevent the client WLM runtime from reaching a functioning server due to lack of
server group information.
If the application server calls the home of an EJB that is not workload managed, the call is first directed to the
Location Service Daemon (LSD) on the admin server where the EJB is hosted. Failure of the LSD prevents
the server from completing the call.
With security enabled, the application server contacts the security server within the admin server when
validating or revalidating the security credentials. Failure of the admin server prevents the application server
from validating a new client, or from revalidating an old client whose credentials have expired.
The admin server is used by the transaction manager to perform logging for two phase commit operations. If
the application server makes use of two phase commit and the administrative server crashes after all
participants are in a prepare state and before the commit takes place, it is possible to have rows in the
database locked until the transaction can be recovered using the transaction log.
5.2.1.2 EJB clients
EJB calls may be initiated from a servlet, EJB, J2EE application client, or an unmanaged thin client. They call
the admin server’s bootstrap and name server when making use of the JNDI service to look up EJB homes. If
calling workload managed EJB homes or instances, the WLM runtime in the clients may also contact the admin
server for server group information, as discussed previously.
EJB clients also make use of the LSD in the admin server when calling an EJB home that is not workload
managed.
5.2.2 Enabling High Availability
5.2.2.1 JNDI Client Failover Behavior
Access to many WebSphere objects, including EJBs and resources such as database connections, is
provided through JNDI lookups. Access to resources does not pose a high availability concern, because they
are created locally in the same process as the clients that use them. Access to EJB home references, however,
involves a remote call to the name server residing in the admin server. The code for this lookup would look
similar to the following:
1: Properties p = new Properties();
2: p.put(javax.naming.Context.PROVIDER_URL, "iiop://myserver1.domain.com:900");
3: InitialContext ic = new InitialContext(p);
4: Object tmpObj = ic.lookup("ejb/myEJB");
5: myEJBHome beanhome = (myEJBHome)javax.rmi.PortableRemoteObject.narrow((org.omg.CORBA.Object)tmpObj, myEJBHome.class);
6: myEJB mybean = beanhome.create();
Code Sample 5.1
The creation of the InitialContext object (line 3) is performed by the JNDI runtime by first bootstrapping to the
bootstrap server at port 900 of the admin server at myserver1.domain.com. This returns to the client a
workload managed InitialContext that can be used by the client to perform lookup of EJB homes at line 4.
The first high availability consideration for using the JNDI service is the initial bootstrap, which may fail if the
admin server or the node hosting the admin server fails while line 3 is being executed. In general, the client has
no good way of knowing what has transpired to cause the failure. The best course of action is for the client to
catch an exception, and re-bootstrap to a different bootstrap server. This means that the application needs to
be aware of more than one admin server in the domain, and be coded to bootstrap to a different bootstrap
server.
The code sample below provides an example of a client coded to bootstrap to multiple administrative servers.
It replaces lines 1 through 3 of Code Sample 5.1. This code assumes that an array (hostURL) has been created which
contains a list of bootstrap servers and bootstrap ports in colon-separated format:
localhost:900
myserver:1800
The client attempts to bootstrap to the first server in the array. If this fails due to a NamingException, the next
server in the list is tried. If all servers fail, the client prints a message and exits.
int i = 0;
InitialContext theRootContext = null;
Properties sysProps = System.getProperties();
Properties p = (Properties) sysProps.clone(); // won't change system properties
while (i < hostURL.length)
{
    try {
        p.put(Context.PROVIDER_URL, "iiop://" + hostURL[i]);
        theRootContext = new InitialContext(p);
        break;
    } catch (NamingException ex) {
        if (i == (hostURL.length - 1))
        {
            System.out.println("None of the servers are available, program terminating...");
            System.exit(1);
        }
        else
        {
            System.out.println("Server " + hostURL[i] + " is not available, will use next one");
            i++;
        }
    }
}
Code Sample 5.2
Bootstrapping can also occur implicitly. For example, application servers need to bootstrap to the admin
server to get an InitialContext during server startup. If this bootstrap fails, the application server will be unable
to start. We rely on clustering of the application servers to ensure that other servers are available to process
client requests.
If lines 1 and 2 of Code Sample 5.1 are omitted, the default bootstrap server is used, typically localhost at port 900. For
an application client, failover is accomplished by restarting the client to bootstrap to a different admin server:
launchClient ... -CCBootstrapHost=myserver2.domain.com -CCBootstrapPort=900
In this example, the launchClient command is used to start the client to bootstrap to myserver2.domain.com, at
port 900. The same is achieved by changing the application client properties file:
BootstrapHost=myserver2.domain.com
BootstrapPort=port
The default bootstrap location for a thin Java client may be changed by changing the ORB system properties
when starting the client from the command line:
java -Dcom.ibm.CORBA.BootstrapHost=myserver2.domain.com -Dcom.ibm.CORBA.BootstrapPort=900 . . .
Once the initial bootstrap is completed, the second high availability consideration is the actual lookup of the
EJB home at line 4. If the admin server fails, the code performing the lookup will get an exception. This
problem is addressed by turning on admin server WLM, and writing additional code to retry. Since the
InitialContext itself is workload managed, retrying line 4 allows the WLM runtime to failover to a different
admin server, allowing the lookup operation to complete successfully.
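A minimal sketch of such retry logic follows; the EJB name and the retry count are arbitrary, and production code would normally log the failure as well:

// Sketch: retry an EJB home lookup so that the workload managed InitialContext can
// route the retry to a surviving admin server.
private Object lookupWithRetry(javax.naming.Context ctx, String name)
        throws javax.naming.NamingException {
    javax.naming.NamingException lastFailure = null;
    for (int attempt = 0; attempt < 3; attempt++) {
        try {
            return ctx.lookup(name);
        } catch (javax.naming.NamingException ne) {
            lastFailure = ne; // the admin server may have failed; the retry is rerouted by WLM
        }
    }
    throw lastFailure;
}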
The call to the create method on line 6 is the first call which invokes the EJB server group. If this call fails, this
indicates an EJB container failure. EJB container failures are addressed in more detail in Chapter 4.
There are four types of administrative server failover scenarios:
• Expected outage of the administrative server - The admin server is stopped from one of the admin
interfaces (admin console, XMLConfig, WSCP).
• Unexpected outage of the administrative server - The administrative server crashes for an unknown
reason. This can be simulated by killing the process from the operating system.
• Expected outage of the machine - WebSphere is stopped and the machine is shut down.
• Unexpected outage of the machine - The machine is removed from the network due to shutdown,
network failure, hardware failure, etc.
In the first two cases, the physical machine where the admin server was running is still available. When a client
attempts to make a connection to this failed admin server, the remote machine will refuse the connection attempt
right away. Bootstrapping to a different server or retrying the lookup allows the client to route the request to
one of the other servers.
In the second two cases, the physical machine where the admin server was running is no longer available. In
this case, the client relies on the ORB to timeout the connection. The same tuning parameters discussed for
EJB workload management in chapter 4 can be applied to admin server workload management.
5.2.2.2 Location Service Daemon (LSD) Behavior
When a WebSphere application server is started, it allocates a listener port dynamically to process ORB
requests. This listener port is registered with the LSD. When an EJB home that is not workload managed is
looked up though JNDI, the object reference contains the host and port of the LSD. The very first request to
an EJB home is first routed to the LSD, which reroutes the request to the real EJB home. This extra level of
indirection enables dynamic port assignment for EJB servers, while allowing the object reference, which is
bound to the LSD, to be reusable across server restarts. Since EJB homes that are not workload managed
only reside on one application server, the node where the server resides already constitutes a single point of
failure. The use of the LSD is just one more single point of failure.
To properly address high availability, at least the EJB home itself needs to be workload managed. If the bean
is workload managed, retrying an operation after a failure allows the client to failover. If only the home is
workload managed, retrying on the home allows the client to failover to a different home to create or find
beans to work with. With workload managed homes or beans, the WLM runtime locates the current host/port
of the application servers to route a request, bypassing the LSD altogether. This avoids the LSD as a single
point of failure.
5.2.2.3 Security Server
An application server uses the security server in the admin server to validate or revalidate security credentials.
Enabling admin server WLM allows the security runtime in the application servers to failover to a different
security server. The client may have to retry during credential validation during failover.
There is currently an issue with security failover in WebSphere 4.x. If the admin server fails, application
servers on that node are unable to failover to a backup admin server for credential validation. This is being
fixed under APAR PQ55817. With this fix applied, failover should behave properly.
5.2.2.4 Transaction logging
If the application makes use of more than one transactional resource, transaction logging is performed via the
admin server during two phase commit. Note that the transaction manager will not initiate two phase commit if
an application touches just one resource, even for XA capable resources. If the admin server fails prior to the
start of the two phase commit protocol, the transaction manager will wait for the admin server to be restarted,
so that it is able to perform the logging. If the admin server fails after the start of the two phase commit
protocol, the transaction manager will also wait for the admin server to be restarted so as to complete the
logging required for two phase commit. If the application server itself fails, it will be restarted to automatically
complete the two phase commit protocol.
If the node hosting the admin server fails after the start of the two phase commit protocol, the transaction can
not be completed since the log resides only on the failed node, and there is no backup process capable of
accessing the log. In this case, database rows used by the transaction are locked until the transaction can be
completed. When this occurs, the node needs to be fixed, and the admin and application servers restarted.
As an alternative, another node with identical configuration and the same host name may be configured, with
the transaction log copied over to the new node. It can then be restarted to complete the transaction. The
transaction logs are located in the "tranlog" directory of the WebSphere installation.
5.2.2.5 Starting an application server
If the admin server fails while starting an application server, the server is unable to start. Utilizing horizontal or
vertical clustering allows the clients to failover to a different server within the server group.
5.3 System Administration Support in the Administrative Server
The second major function of the admin server is to provide a single administration point for all of the objects
within the WebSphere domain. The admin server allows for administration of information stored in the
WebSphere repository.
5.3.1 Types of System Administration Clients
5.3.1.1 Administrative Console (admin console)
The admin console is the graphical user interface for systems administration. It uses the bootstrap server and
the name server in the admin server to perform JNDI operations. User operations on the admin console are
relayed to the EJBs on the admin server, and possibly indirectly to another admin server if the request affects
another node. If security is enabled, the security runtime in the admin server uses the security server in the
admin server to authenticate the user.
If the admin server fails during admin console initialization, the console will fail to bootstrap to the admin server.
This prevents the console from starting up altogether. Failover is achieved by restarting the admin console to
bootstrap to a different admin server. The command line options are:
adminclient -host host [-port port]
where the default port number is 900.
After the admin console successfully starts up, workload management of the admin server takes effect. If an
operation on the admin console fails, retrying the console allows the workload management runtime to reroute
the request to a different admin server to complete the operation. Note that requests involving the applications
on the node where the admin server failed, such as querying the state of the servers on that node, will fail until
the admin server is restarted. But requests affecting other nodes will succeed.
5.3.1.2 XMLConfig
XMLConfig allows the system administrator to perform application management functions, such as application
installation. If the admin server fails, XMLConfig operations will fail as well. If the operations of XMLConfig
affect the node where the admin server is running, the admin server needs to be restarted before those
operations can succeed. However, if the operations do not affect the applications on the same node as the
failed admin server, XMLConfig may be rerun to bootstrap to a different admin server with the following
parameters:
-adminNodeName host
There is currently a problem in which XMLConfig may attempt to contact the failed administrative server to
process a cloned application, even if that application is also available on one of the other nodes. This is being
investigated by development and will be fixed in a future release.
5.3.1.3 WebSphere Control Program
WebSphere Control Program (WSCP) is used by the system administrator to perform administration
operations, such as stopping and starting servers. If the operation affects the node where the admin server fails,
the admin server needs to be restored before the operation can complete. If the operation does not affect the
node of the failed admin server, WSCP can be rerun to bootstrap to a different admin server by setting the
following property either from the command line or via the configuration property file:
wscp.hostName=host
5.3.2 Known limitations of System Administration Failover
Due to the distributed nature of WebSphere, it is possible that an admin server may need to retrieve
information about an application on a different node. It may also have to delegate administration functions to
the admin server hosted on the same node as the affected application servers. To do this, the admin server
opens a communication link to the admin server on the remote node. If the admin server on the remote node is
unavailable, the operations will not complete successfully.
5.4 Configuration parameters which affect admin server failover
Some settings can be configured on the WLM client to influence how the failover occurs. These settings
become especially important in the case where the network connection to an administrative server is lost. In
general, the default values for these parameters can be accepted. However, if you see unusual behavior or
have a very slow or very fast network, you may want to consider adjusting the following parameters on the
client:
• TCP/IP timeout - Affects the time required to get an exception while bootstrapping to a node that has
been shut down or disconnected from the network. This setting is covered in Appendix B.
• com.ibm.CORBA.requestTimeout - This property specifies the time-out period for responding to
workload management requests. This property is set in seconds.
• com.ibm.ejs.wlm.MaxCommFailures - This property specifies the number of attempts that a workload
management client makes to contact the administrative server that manages workloads for the client.
The workload management client run time does not identify an administrative server as unavailable until
a certain number of attempts to access it have failed. This allows workload management to continue if
the server suffers from transient errors that can briefly prevent it from communicating with a client. The
default value is 0.
• com.ibm.ejs.wlm.UnusableInterval - This property specifies the time interval that the workload
management client run time waits after it marks an administrative server as unavailable before it
attempts to contact the server again. This property is set in seconds. The default value is 900 seconds.
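For example, a thin Java client could be launched with all three properties set explicitly; the values shown are illustrative only, and suitable values depend entirely on your network and topology:
java -Dcom.ibm.CORBA.requestTimeout=60 -Dcom.ibm.ejs.wlm.MaxCommFailures=2 -Dcom.ibm.ejs.wlm.UnusableInterval=300 MyClient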
See InfoCenter article
http://www-4.ibm.com/software/webservers/appserv/doc/v40/ae/infocenter/was/070206.html for more
information on these parameters.
Chapter 6 - WebSphere Database Failover
6.1 – Introduction
Figure 6.1: Single WebSphere Database Topology
As noted in the introduction to this paper, there are two types of availability: data availability and process
availability. High availability for a database server encompasses both of these. Data availability is the domain of
the database manager, while process availability falls into the domain of one of the clustering technologies
employed to make the database server process highly available. The specifics of configuring your database
server for high availability are addressed in chapter 7. However, it is important to note that even in a
WebSphere environment which employs one of the HA mechanisms discussed in chapter 7, there is still an
interruption in service while the database is switched from a failed server to an available server. This chapter
will focus on application code implications for HA as well as administrative options in the WebSphere runtime
that allow applications to tolerate a database outage.
WebSphere Application Server adds some key components that allow it to survive a database server failure,
or more precisely the interruption and subsequent restoration of service by the database server. These
components take two forms. The first is a set of IBM extensions to the JDBC 2.0 API. These extensions give
WebSphere, and applications running within WebSphere, the ability to easily reconnect to a database server
once it has recovered from a failure, or to recognize that the database is not responding to requests. The first
JDBC extension is the com.ibm.websphere.ce.cm.StaleConnectionException. The StaleConnectionException
maps the multiple SQL return codes that occur in the event of a database outage to a single exception. Not
only does the WebSphere runtime (Admin Server) utilize this mechanism, but it is provided as part of the
WebSphere programming model for application components (Servlets, JSPs, and EJBs) running in
WebSphere. This extension to JDBC allows applications running in WebSphere and WebSphere itself to
reconnect to a database after service is restored. The com.ibm.ejs.cm.pool.ConnectionWaitTimeoutException
is the second JDBC extension. This exception occurs if the connection timeout parameter for the datasource is
exceeded. This parameter specifies the amount of time an application will wait to obtain a connection from the
pool if the maximum number of connections is reached and all of the connections are in use.
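A minimal sketch of catching this exception is shown below; it assumes, as the text implies, that the exception surfaces from getConnection() and, being a subclass of SQLException, can be caught ahead of the general SQLException handler. The datasource variable ds and the recovery action are application-specific:

// Sketch: distinguish a connection pool wait timeout from other SQL failures.
// "ds" is assumed to be a javax.sql.DataSource already obtained from JNDI.
java.sql.Connection conn = null;
try {
    conn = ds.getConnection();
    // ... use the connection ...
} catch (com.ibm.ejs.cm.pool.ConnectionWaitTimeoutException cwte) {
    // The pool is at its maximum size and no connection freed up within the connection
    // timeout; the application might return a "please try again later" page here.
} catch (java.sql.SQLException sqle) {
    // Other database errors.
} finally {
    try {
        if (conn != null) conn.close();
    } catch (java.sql.SQLException ignore) {
        // usually can be ignored
    }
}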
The second component enables applications running within WebSphere to continue to function while the
WebSphere Administrative Repository database is unavailable. Because the WebSphere Administrative
Repository stores the WebSphere JNDI name space, an outage of this repository can also cause an inability to
perform JNDI lookups. WebSphere provides a runtime JNDI cache available in application clients and
application servers, which allows the JNDI service to continue while the administration database is unavailable.
6.2 – Application Databases
6.2.1 StaleConnectionException
From a WebSphere application code perspective, stale connections are those connections that for some
reason can no longer be used. This can happen, for example, if the database server is shut down or if the
network is experiencing problems. In all of these cases, the connections are no longer usable by the application
and the connection pool needs to be flushed and rebuilt. This type of support was added in Version 3.5.2, and
is improved in Version 4.0. More vendor-specific error codes have been added to the mapping that results in
a StaleConnectionException. In addition, when a StaleConnectionException is detected, the entire pool of
connections is automatically destroyed.
Explicitly catching a StaleConnectionException is not required by most applications. Because applications are
already required to catch java.sql.SQLException, and StaleConnectionException extends SQLException,
StaleConnectionException is automatically caught in the general catch-block. However, explicitly catching
StaleConnectionException makes it possible for an application to perform additional recovery steps from bad
connections. Before continuing, it is important to understand that WAS will automatically reconnect to the
database on future requests, without any application intervention. Explicit use of StaleConnectionException
should therefore be limited to applications that need an extra level of transparency beyond the automatic
recovery provided by WAS. Such transparent recovery code can be difficult to develop in scenarios involving
multiple resources or complex logic flows. The following discussion is aimed at teams that decide this additional
complexity is warranted.
The most common time for a StaleConnectionException to be thrown is the first time that a connection is used,
just after it is retrieved. Because connections are pooled, a database failure is not detected until the operation
immediately following its retrieval from the pool, which is the first time communication to the database is
attempted. It is only when a failure is detected that the connection is marked stale.
StaleConnectionException occurs less often if each method that accesses the database gets a new connection
from the pool and releases it when finished. When a database failure is detected, all connections currently
handed out to the application are marked stale; the more connections the application is holding at that point,
the more StaleConnectionExceptions will occur.
Generally, when a StaleConnectionException is caught, the transaction in which the connection was involved
needs to be rolled back and a new transaction begun with a new connection. Details on how to do this can be
broken down into three categories:
• A connection in auto-commit mode
• A connection not in auto-commit and transaction begun in the same method as database access
• A connection not in auto-commit and transaction begun in a different method from database access
6.2.1.1 Connections in auto-commit mode
By default, any connection obtained from a one-phase datasource (implementing
javax.sql.ConnectionPoolDataSource) is in auto-commit mode when there is no scoping transaction. When in
auto-commit mode, each database action is executed and committed in a single database transaction. Servlets
often use connections in auto-commit mode, because transaction semantics are not necessary. Enterprise
applications do not usually use connections in auto-commit mode. Auto-commit can be explicitly disabled by
calling setAutoCommit() on a Connection object. When a StaleConnectionException is caught from a
connection in auto-commit mode, recovery is a simple matter of closing all of the associated JDBC resources
and retrying the operation with a new connection. Note: In some cases the cause of the database outage might
be transient. In these cases, it might be worthwhile to add a pause to the retry logic to allow for database
service restoration. The number of retries, as well as any pause, should be kept small so as not to keep a web
site user waiting indefinitely.
An example of this follows:
public void myConnPool() throws java.rmi.RemoteException
{
    // retry indicates whether to retry or not
    // numOfRetries states how many retries have been attempted
    boolean retry = false;
    int numOfRetries = 0;
    java.sql.Connection conn = null;
    java.sql.Statement stmt = null;
    do {
        try {
            // Assumes that a datasource has already been obtained from JNDI
            conn = ds.getConnection();
            stmt = conn.createStatement();
            stmt.execute("INSERT INTO ORG VALUES (10, 'Pacific', '270', 'Western', 'Seattle')");
            retry = false;
        } catch (com.ibm.websphere.ce.cm.StaleConnectionException sce) {
            // if a StaleConnectionException is caught, retry the action a limited number of times
            if (numOfRetries < 2) {
                retry = true;
                numOfRetries++;
                // add an optional pause to allow the database service to be restored
                try {
                    Thread.sleep(10000);
                } catch (InterruptedException ie) {
                    // usually can ignore
                }
            } else {
                retry = false;
            }
        } catch (java.sql.SQLException sqle) {
            // deal with other database exceptions
        } finally {
            // always clean up JDBC resources
            try {
                if (stmt != null) stmt.close();
            } catch (java.sql.SQLException sqle) {
                // usually can ignore
            }
            try {
                if (conn != null) conn.close();
            } catch (java.sql.SQLException sqle) {
                // usually can ignore
            }
        }
    } while (retry);
}
6.2.1.2 Connections not in auto-commit mode
If a connection does not have auto-commit enabled, multiple database statements can be executed in the same
transaction. Because each transaction uses a significant number of resources, fewer transactions result in better
performance. Therefore, if a connection is used for executing more than one statement, turn off auto-commit
mode and use transactions to group a number of statements into one unit of work. Keep in mind that if a
transaction has too many statements, the database can experience problems due to lack of memory.
6.2.1.2.1 Transactions started in the same method
If a transaction is begun in the same method as the database access, recovery is straightforward and similar to
the case of using a connection in auto-commit mode. When a StaleConnectionException is caught, the
transaction is rolled back and the method retried. If a StaleConnectionException occurs somewhere during
execution of the try block, the transaction is rolled back, the retry flag is set to true, and the transaction is
retried. As is the case with connections in auto-commit mode, the number of retries and the length of any pause
should be limited, because the exception might not be transient. This is illustrated in the following:
do {
    try {
        // begin a transaction
        tran.begin();
        // Assumes that a datasource has already been obtained from JNDI
        conn = ds.getConnection();
        conn.setAutoCommit(false);
        stmt = conn.createStatement();
        stmt.execute("INSERT INTO ORG VALUES (10, 'Pacific', '270', 'Western', 'Seattle')");
        tran.commit();
        retry = false;
    } catch (com.ibm.websphere.ce.cm.StaleConnectionException sce) {
        // if a StaleConnectionException is caught, roll back and retry the action
        try {
            tran.rollback();
        } catch (java.lang.Exception e) {
            // deal with exception; in most cases, this can be ignored
        }
        // limit the number of retries, as in the previous example
        if (numOfRetries < 2) {
            retry = true;
            numOfRetries++;
        } else {
            retry = false;
        }
    }
    // catch other database exceptions and clean up JDBC resources as in the previous example
} while (retry);
6.2.1.2.2 Transactions started in a different method
When a transaction is begun in a different method from the database access, an exception needs
to be thrown from the data access method to the method that manages the transaction so that it can retry the
operation. Ideally, the data access method can throw an application-defined exception indicating that the
operation can be retried, as sketched below. However, this is not always possible; often a method is defined to
throw only particular exceptions. This is the case with the ejbLoad and ejbStore methods on an enterprise bean.
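As an illustration of the first approach, the following sketch separates the data access from the transaction
management. The exception class RetryableOperationException and the method names are hypothetical, and the
UserTransaction (tran) is assumed to have been obtained elsewhere, as in the earlier examples.

public class RetryableOperationException extends Exception {}

// Data access method: translates a stale connection into a signal to the caller
public void insertOrg(javax.sql.DataSource ds)
    throws RetryableOperationException, java.sql.SQLException
{
    java.sql.Connection conn = null;
    java.sql.Statement stmt = null;
    try {
        conn = ds.getConnection();
        stmt = conn.createStatement();
        stmt.execute("INSERT INTO ORG VALUES (10, 'Pacific', '270', 'Western', 'Seattle')");
    } catch (com.ibm.websphere.ce.cm.StaleConnectionException sce) {
        throw new RetryableOperationException();
    } finally {
        try { if (stmt != null) stmt.close(); } catch (java.sql.SQLException e) { /* usually can ignore */ }
        try { if (conn != null) conn.close(); } catch (java.sql.SQLException e) { /* usually can ignore */ }
    }
}

// Transaction method: owns the transaction and decides whether to retry
public void updateOrg(javax.sql.DataSource ds) throws java.lang.Exception
{
    boolean retry;
    int numOfRetries = 0;
    do {
        retry = false;
        try {
            tran.begin();
            insertOrg(ds);
            tran.commit();
        } catch (RetryableOperationException roe) {
            tran.rollback();
            if (numOfRetries < 2) {
                retry = true;
                numOfRetries++;
            }
        }
    } while (retry);
}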
A more comprehensive discussion of each of these scenarios, along with code samples, is available in the
whitepaper "WebSphere Connection Pooling," which is available from:
http://www-4.ibm.com/software/webservers/appserv/whitepapers/connection_pool.pdf
6.2.1.3 Connection Error Recovery
A servlet or Bean Managed Persistence (BMP) entity EJB coded to catch a StaleConnectionException, or a
Container Managed Persistence (CMP) entity EJB, for which the container catches the exception, should
recover from a StaleConnectionException without any visible impact from the application perspective.
Messages will be displayed in the Administrative Console as depicted below.
Figure 6.2: StaleConnectionException displayed in Administrative Console
6.2.2 ConnectionWaitTimeoutExceptions
While the StaleConnectionException provides a mechanism for recovery once the database manager process
is restored, a related issue is avoiding a “frozen website”. A website is “frozen”, or appears to be, when servlet
threads are waiting for connections from the connection pool, and as viewed from the web browser nothing is
happening. This can occur for any number of reasons, but for database access it’s important that applications
be coded to catch com.ibm.ejs.cm.pool.ConnectionWaitTimeoutException. This exception occurs if the
Connection Timeout parameter for the datasource is exceeded. This parameter specifies the amount of time
an application will wait to obtain a connection from the pool if the maximum number of connections is reached
and all of the connections are in use. Though the website has not actually suffered an outage, this appears to be
the case from the perspective of the end user. Coding the application to return a message to the user and
tuning the connection timeout value for the datasource can significantly improve the end user experience with a
web site under heavy load. The default value for this parameter is 180 seconds (3 minutes), though any
non-negative integer is a valid value. Setting this value to 0 disables the connection timeout.
This value can also be changed programmatically by calling setLoginTimeout() on the datasource. If
setLoginTimeout is called on the datasource, this sets the timeout for all applications that are using that
datasource. For this reason, it is recommended that setLoginTimeout not be used. Instead, the connection
timeout property should be set on the datasource during configuration. The value for this parameter is specified
on the Connection Pooling Tab for each Data Source as depicted below:
Figure 6.3: Connection timeout configuration
An application sample for handling this exception is depicted below:
java.sql.Connection conn = null;
javax.sql.DataSource ds = null;
try {
    // Retrieve a DataSource through the JNDI Naming Service
    java.util.Properties parms = new java.util.Properties();
    parms.put(javax.naming.Context.INITIAL_CONTEXT_FACTORY,
        "com.ibm.websphere.naming.WsnInitialContextFactory");
    // Create the Initial Naming Context
    javax.naming.Context ctx = new javax.naming.InitialContext(parms);
    // Look up through the naming service to retrieve a DataSource object
    ds = (javax.sql.DataSource) ctx.lookup("java:comp/env/jdbc/SampleDB");
    conn = ds.getConnection();
    // work on connection
} catch (com.ibm.ejs.cm.pool.ConnectionWaitTimeoutException cw) {
    // notify the user that the system could not provide a
    // connection to the database
} catch (java.sql.SQLException sqle) {
    // deal with exception
} catch (javax.naming.NamingException ne) {
    // deal with exception from the JNDI lookup
}
6.3 Session Database
The default for WebSphere Application Server is to store HTTP session objects in memory. Even with the
improvements in the session affinity implementation in WebSphere V4.x, placing HTTP session objects only in
memory does not provide a mechanism for failover of requests associated with an HTTP session object if the
application server process is shut down or fails.
The Session Manager in WebSphere Application Server provides an option for using a database as the
mechanism for storing HTTP session objects. In this implementation, HTTP session objects are updated in
memory and session information is periodically written to a database, providing a scalable mechanism
for failover of the HTTP session object to an alternative application server. In this implementation, all
application servers in a WebSphere cluster are candidates for failover of the request associated with the HTTP
session object. For applications that use the Servlet HTTPSession API, the database used to persist the
HttpSession object represents a potential single point of failure.
Fortunately with WebSphere V4.x there are a number of improvements in the Session Manager that allow for
tuning of the frequency of updates to the database used for persistence of the HTTP session object. These
options provide better scalability and failover capabilities than available in previous releases of WebSphere.
In brief, the Session Manager may be tuned to update the database:
• At the end of the servlet service method (the default).
• Manually (this requires use of an IBM extension to the HttpSession API).
• At the end of a specified time interval.
The default behavior for update of the database at the end of the servlet service method implies that the
database must be continuously available in order to service requests. Even highly available databases are not
continuously available; hence correct application design and Session Manager tuning, in conjunction with the
use of a highly available database, are the keys to minimizing disruptions in service in the event of a database
outage.
6.3.1 Expected Behavior - Servlet Service Method
In the event of a failure of the session database, servlet requests are queued waiting for database service to be
restored. As with the rest of the WebSphere runtime, the Session Manager uses the WebSphere datasource
implementation and the StaleConnectionException mechanism. Since the Session Manager is the database
client, and the application is the client to the Session Manager, the application has no visibility of any outage of
the database. The Session Manager handles all database failure and subsequent recovery transparently to the
application. Once the database service is restored the database connection pool is destroyed and recreated by
the runtime, and all queued requests are serviced. The only visibility to the problem is the messages in the
administration console, depicted below, and the web browser “hanging”.
Figure 6.4: Get Connection Failure displayed in Administrative Console
6.3.2 Expected Behavior - Manual Update
Another option for session persistence updates in WebSphere involves the use of an IBM extension to the
HttpSession API, known as manual update. Manual update allows the application to decide when a session
should be stored persistently. With manual update, the Session Manager only sends changes to the persistent
data store if the application explicitly requests a save of the session information.
Manual update requires that an application developer use the IBMSession class for managing sessions. When
the application invokes the sync() method, the Session Manager writes the modified session data and last
access time to the persistent session database. The session data that is written out to the database is controlled
by the Write Contents option selected.
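A minimal servlet fragment is sketched below, assuming the web application has been configured for manual
update and that the IBMSession extension described above is available; the session attribute name and value
are illustrative only.

public void doGet(javax.servlet.http.HttpServletRequest req,
                  javax.servlet.http.HttpServletResponse res)
    throws javax.servlet.ServletException, java.io.IOException
{
    javax.servlet.http.HttpSession session = req.getSession(true);
    // modify the session as usual
    session.setAttribute("lastOrderId", "A1234");
    // with manual update, changes are only persisted when sync() is invoked
    com.ibm.websphere.servlet.session.IBMSession ibmSession =
        (com.ibm.websphere.servlet.session.IBMSession) session;
    ibmSession.sync();
}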
If the servlet or JSP terminates without invoking the sync() method, the Session Manager saves the contents of
the session object into the session cache (if caching is enabled), but does not update the modified session data
in the session database. The Session Manager will update just the last access time in the database
asynchronously at a later time.
The expected behavior and recovery with manual update are the same as with the default update that occurs at
the end of the servlet service method.
6.3.3 Expected Behavior - Time Based Update
With time based update, all updates of the HTTP session object occur in memory, as is the case with the
default and manual update, but the database updates are deferred until the end of the specified time interval.
By deferring the updates to the database, the probability that any disruption in database service will disrupt
servlet requests is minimized. For example, the default interval for time based updates is 300 seconds, so a
database failure that occurred immediately after the last update would not impact servlet requests for up to
300 seconds. If database service were restored during this interval, the outage would not even be noticed, aside
from a warning message in the administration console as depicted below.
Figure 6.5: StaleConnectionException during Time Based Update
In the case where a database update was to take place during an interruption in database service, then the
Session Manager functions in the same manner as with updates at the end of the servlet service method and
manual updates, by deferring servlet requests until the database service is restored.
The amount of time appropriate for a given environment can only be determined through testing to determine
the amount of time required for database failover and recovery. While the default time interval of 300 seconds
(5 minutes) should prove more than adequate for database failover and recovery in most cases, the time
interval can be fine-tuned for your environment by using the manual persistence configuration dialog in the
Session Manager.
Figure 6.6: Persistence Tuning Configuration Property Sheet
While specification of a longer time interval should lower the probability of an update coinciding with a
database failure, interval tuning not only needs to consider the time required for database failover, but also the
amount of traffic on the website in that interval, as well as the resulting session objects cached by the session
manager and the attendant JVM memory impact.
Despite the robust Session Manager implementation provided by WebSphere, there remain windows of
vulnerability for HTTP session state maintenance and persistence. For example, when time based update is
used, if an application server process fails between updates, all in-memory updates since the last database
update are lost. Thus, in cases where the state information being stored is valuable ("state liability is high"), a
recommended alternative to storing state information in the HTTP session is for the application to explicitly
store the information in a database via a direct JDBC call or an entity EJB, rather than relying on the
persistence mechanism used by the Session Manager. This alternative provides for transactional updates of
state information.
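As a hedged sketch of this alternative, the fragment below writes high-value state directly to an application
table inside a transaction instead of placing it in the HTTP session; the table ACCOUNT_STATE, its columns,
and the variables userId and serializedCart are hypothetical.

java.sql.Connection conn = null;
java.sql.PreparedStatement ps = null;
try {
    conn = ds.getConnection();            // datasource obtained from JNDI, as in the earlier examples
    conn.setAutoCommit(false);
    ps = conn.prepareStatement(
        "UPDATE ACCOUNT_STATE SET CART_CONTENTS = ? WHERE USER_ID = ?");
    ps.setString(1, serializedCart);
    ps.setString(2, userId);
    ps.executeUpdate();
    conn.commit();                        // the state is durable once commit returns
} catch (java.sql.SQLException sqle) {
    try { if (conn != null) conn.rollback(); } catch (java.sql.SQLException e) { /* usually can ignore */ }
    // handle StaleConnectionException and retry as described in section 6.2
} finally {
    try { if (ps != null) ps.close(); } catch (java.sql.SQLException e) { /* usually can ignore */ }
    try { if (conn != null) conn.close(); } catch (java.sql.SQLException e) { /* usually can ignore */ }
}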
6.4 Administrative Database
The administrative database is a potential single point of failure for many operations involving both the system
administration and application runtime environments. It stores the configuration information about application
servers, and what applications to run on those servers. The configuration information includes lists of servers in
server groups for use by the workload management runtime. The administrative database also stores the
operational information about the state of application servers. The output of serious events is stored in the
administrative database. Finally, it serves as the repository for the name server used to support the JNDI
service. An outage of the Administrative database can have an effect on all of these services.
Many customers make a distinction between the loss of administrative function and loss of application function
during administration database failure. They are willing to tolerate the loss of administrative function, but
unwilling to tolerate the loss of application function. In this section we will describe how to make use of JNDI
caching to ensure application functions are uninterrupted despite failure of the administrative database.
6.4.1 System Administration
While the administrative database is down, administrative functions requiring reads and writes to this database
will fail. These operations include stopping and starting application servers, querying the state of servers, or
changing the application topology through any of the WebSphere administrative interfaces. New administrative
instances, such as Administrative Consoles, XMLConfig instances, and WSCP instances can not be started
while the administrative database is unavailable.
If an application server crashes unexpectedly while the administrative database is available, the administrative
server will automatically restart the application server. If the administrative database is unavailable, the
application server cannot be restarted, because the administrative server cannot retrieve the appropriate
information from the administrative database.
When the administrative database is functioning normally, WebSphere events are logged to both an activity.log
file on the local node and a table in the database. This database logging enables any administrative server in
the WebSphere domain to view all serious events, rather than just those on the local node. When the
administrative database is unavailable, events are logged only to the activity logs on the local nodes. The
following error message will be displayed in the administrative console:
Figure 6.7: Attempt to log serious event during database outage
6.4.2 Application Runtime Environment - JNDI caching
We often discuss application servers in terms of “started” and “stopped.” However, the started phase can
actually be divided into two separate sub-states, initialization and steady state. During initialization, the
application is in the process of acquiring the resources required for smooth execution, including deployment
descriptors for servlets and EJBs, workload management server group information, data sources, and EJB
homes. At steady state all of the resources are cached in memory, and the task of the application is to process
user requests.
As mentioned in the previous section, when the administrative database is unavailable, new application servers
can not be started. Also, application servers in the initialization phase will not load successfully. However, if
the JNDI cache is properly configured in both the application server and any Java clients in the environment,
applications in steady state will continue to function properly.
The JNDI service in WebSphere runs inside of the WebSphere Administrative Server, and is backed by the
administrative database. If this database fails, the JNDI service is unable to return lookup responses. In
WebSphere V3.5.2 a client-side JNDI cache was added as a performance enhancement. As JNDI lookups
were performed, the results were stored in a cache within the JNDI client process. The next time the process
attempted to perform the same lookup, the response was retrieved from the cache rather than from the JNDI service.
Primarily designed as a performance improvement, this cache did provide some measure of toleration for
failure of the administration database.
In WebSphere 4.0.2, configuration options for this client-side JNDI cache are provided to enable it as a
mechanism for toleration of administrative database failure. The JNDI cache can be configured to perform in
one of three ways. All these options allow the application to initialize the JNDI cache during initialization,
allowing the application to function should the administrative database fail after the application is started.
The first configuration pre-loads all EJB references used by an application when the application is started. If
at any point the JNDI server becomes unavailable, these references are already cached, and the application
will continue to function. Note that explicit steps to cache resources looked up through resource references in
the "java:comp" namespace are unnecessary. These resources reside in the same process as the application
client or application server; the parameters used to create them are loaded into memory as part of
application server or client initialization, and the actual lookup is performed locally, in process.
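For example, once the EJB references have been pre-loaded, a lookup such as the following is served from
the cache even if the administrative database later becomes unavailable; the reference name ejb/MyHome and
the home interface MyHome are hypothetical.

javax.naming.Context ctx = new javax.naming.InitialContext();
// resolved through the java:comp resource reference, then satisfied from the pre-loaded cache
java.lang.Object obj = ctx.lookup("java:comp/env/ejb/MyHome");
MyHome home = (MyHome) javax.rmi.PortableRemoteObject.narrow(obj, MyHome.class);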
The second configuration pre-loads the entire JNDI name tree at application startup. This may be useful for
applications which do not use the EJB Reference mechanism provided by J2EE, but may adversely affect the
performance and memory utilization of the application if the JNDI name tree is particularly large.
The third option allows an application to pre-load a specific portion of the JNDI name tree by specifying an
XML file which contains additional instructions.
6.4.2.1 XML/DTD syntax
It is possible to create an XML file stating which portions of the JNDI name space you would like a client
process to pre-load. The DTD for this XML file looks like:
<!ELEMENT JNDICachePreload (Provider)* >
<!ELEMENT Provider (Entry | Subtree)* >
<!ELEMENT Entry EMPTY>
<!ELEMENT Subtree EMPTY>
<!ATTLIST Provider
INITIAL_CONTEXT_FACTORY CDATA #IMPLIED
PROVIDER_URL CDATA #IMPLIED>
<!ATTLIST Entry Name CDATA #REQUIRED >
<!ATTLIST Subtree Name CDATA #IMPLIED >
Here is a sample XML file illustrating what is to be pre-loaded.
<?xml version="1.0"?>
<JNDICachePreload>
<!-- This uses default provider and preloads entry a/b/c and subtree rooted at d/e/f -->
<Provider>
<Entry Name="a/b/c"/>
<Subtree Name="d/e/f"/>
</Provider>
<!-- This uses a factory and a provider URL to preload entry X and subtree at Y -->
<Provider INITIAL_CONTEXT_FACTORY="com.ibm.websphere.naming.WsnInitialContextFactory"
PROVIDER_URL="iiop://myserver1.domain.com:900">
<Entry Name="X"/>
<Subtree Name="Y"/>
</Provider>
<!-- This preloads entire tree at a different provider -->
<Provider PROVIDER_URL="iiop://myserver2.domain.com:900">
<Subtree/>
</Provider>
</JNDICachePreload>
6.4.2.2 Configuring the JNDI cache for an application server
The application deployer can install one of two WebSphere CustomServices to configure the JNDI cache for
an application server. These will help ensure that an application server reaches steady state during server
initialization.
The first CustomService, com.ibm.websphere.naming.CustomServicePreloadCacheJavaURL, is used to pre-load
all the EJB references used by an application. The second CustomService,
com.ibm.websphere.naming.CustomServicePreloadCacheNameTree, is used to pre-load the entire name tree.
Both of these CustomServices can take an additional property as input. The input property
com.ibm.websphere.naming.JNDIPreload.configFile=<xmlfile> specifies the location of an XML file that contains
additional JNDI entries to be pre-loaded. See section 6.6.15 in the WebSphere 4.0 InfoCenter for more
information on configuring custom services.
6.4.2.3 Configuring the JNDI Cache for an Application Client
To pre-load the JNDI cache for an application client, use the following command line parameters:
# pre load EJB references
-CCPreloadCacheJavaURL=true
# pre load entire name tree
-CCPreloadCacheNameTree=true
# config file with additional entries to load
-CCJNDIPreloadconfigFile=xmlfile
Or place them in the application client property file:
PreloadCacheJavaURL=true
PreloadCacheNameTree=true
JNDIPreloadconfigFile=xmlfile
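For example, if the command line form is used with the standard launchClient tool to start the client container,
the invocation might look like the following; the EAR and XML file names are illustrative only.

launchClient MyClientApp.ear -CCPreloadCacheJavaURL=true -CCJNDIPreloadconfigFile=preload.xml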
6.4.2.4 Preloading JNDI Cache for Thin Client
Thin clients are non-managed clients not running within a client container. For these clients, it is the application
writer’s responsibility to write code to initialize and cache the JNDI lookups at the beginning of the program’s
initialization.
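A minimal sketch of this approach follows; the JNDI name, the AccountHome interface, and the provider URL
are hypothetical, and error handling is omitted for brevity.

public class ThinClientStartup
{
    private static final java.util.Map homeCache = new java.util.HashMap();

    // called once at the beginning of the program's initialization
    public static void init() throws javax.naming.NamingException
    {
        java.util.Hashtable env = new java.util.Hashtable();
        env.put(javax.naming.Context.INITIAL_CONTEXT_FACTORY,
                "com.ibm.websphere.naming.WsnInitialContextFactory");
        env.put(javax.naming.Context.PROVIDER_URL, "iiop://myserver1.domain.com:900");
        javax.naming.Context ctx = new javax.naming.InitialContext(env);
        // perform and cache every lookup the client will need
        java.lang.Object obj = ctx.lookup("MyApp/AccountHome");
        homeCache.put("AccountHome",
                javax.rmi.PortableRemoteObject.narrow(obj, AccountHome.class));
    }

    // steady-state requests use the cached home instead of calling JNDI again
    public static AccountHome getAccountHome()
    {
        return (AccountHome) homeCache.get("AccountHome");
    }
}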
6.4.2.5 Operating Restrictions
In order to enable client-side caching, applications can not bind to JNDI at runtime. This binding would cause
the caches of already running clients to become out of synch with the JNDI server. This should not be a
problem for the majority of applications, since the J2EE specification does not require the server runtime to
offer this capability. Applications have the option of pre-binding entries into the namespace via a separate
utility. During application initialization, these entries may be cached into memory. Finally, steps must be taken
to preload the JNDI cache, as previously described in this section.
Short-running application clients that are constantly being restarted will not fare well during a failover, since
they have a greater chance of being in the initialization state when the administrative database goes down.
The use of a highly available administrative database can alleviate this problem by shortening the amount of
downtime.
As mentioned in section 6.4.1, system administration operations, including application server starts, will be
unavailable during an administrative database failure. JNDI caching does not provide any relief for this
restriction.
6.4.2.6 JNDI Cache Size Considerations
The space used by the JNDI client cache is about 1.5K per entry. The calculation of actual memory
requirement is complicated by the depth of the compound name, where each component of the name also
takes up a cache entry. For example, an EJB home bound as “a/b/MyHome” requires three cache entries:
“a”, “b”, and “MyHome”. An EJB home bound as “a/b/MyOtherHome” requires only one additional entry
for “MyOtherHome”, since both “a” and “b” are already cached. Therefore, the actual space required for
JNDI client cache is determined by the structure of the name tree. If each process tries to access 1000
EJBs, the space requirement lower bound is 1000*1.5K = 1.5Meg per process. On the other hand,
assuming that the depth of the compound name is D, and the names never overlap, the space requirement
upper bound would be D*1000*1.5K. If the depth is 10, the upper bound is 15Megs. The actual space
requirement will be somewhere in-between the lower and upper bound.
6.4.3 HA Administrative Database
An HA administrative database can be hosted on the same nodes hosting the HA application database(s). In
fact it is recommended that administration database deployment occur in this fashion. When the administration
database is co-located with HA application database(s), database administration and recovery are facilitated
since processes and procedures are already in place for the server. Further, it is generally easier to add the
database into the existing HA monitoring configuration providing the automatic database error detection and
recovery required for a production environment.
However, even with a highly available database, there will still be a period of time during the failover process in
which the administrative database is unreachable. During this failover period, services such as systems
management, serious event logging, and JNDI service will be unavailable. This means that even in a situation
with a highly available database configuration, the JNDI caching options outlined in the above section will be
necessary to ensure uninterrupted application function.
Highly available database options will be discussed in more detail in Chapter 7.
6.4.4 Administrative Database Recovery
After the failover of the administrative database, clients of the services have to reconnect to resume normal
operations. The WebSphere runtime services, such as the WLM runtime and the logging service are already
coded to tolerate the failure of the administrative database. Clients to WebSphere runtime, such as callers to
the JNDI service, have to be coded to retry.
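A hedged sketch of such retry logic follows; the JNDI name, the number of retries, and the pause are
illustrative only.

java.lang.Object home = null;
for (int attempt = 0; attempt < 3 && home == null; attempt++) {
    try {
        javax.naming.Context ctx = new javax.naming.InitialContext();
        home = ctx.lookup("MyApp/AccountHome");
    } catch (javax.naming.NamingException ne) {
        // the JNDI service may still be recovering; pause briefly before retrying
        try { Thread.sleep(5000); } catch (InterruptedException ie) { /* usually can ignore */ }
    }
}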
The first access to the administrative server that causes it to access the administrative database, such as running
an administration client, XMLConfig, or wscp, may result in an exception due to old stale connections still
cached in the server. The administrative server will handle recovery from this StaleConnectionException, and
the next administrative operation will succeed.
An error message will be displayed in the administration console, much the same as that for an application
exception. However the message detail will differ as shown below.
Figure 6.8: StaleConnectionException in the Administrative Console
If an application server or administrative server crashed unexpectedly during a database outage, the normal
nanny procedures would be unable to restart this server. It will be necessary to manually restart these
application servers when the database comes back online. In a larger installation with many nodes, it is
inconvenient to manually check the health of the administrative and application servers. Therefore, a wscp
script has been provided in WebSphere 4.0.2 to check these servers, and optionally restart application
servers. It is not currently possible to restart administrative servers from wscp. To run the script, first change
to the following directory:
on NT: %WAS_HOME%\AppServer\bin
on AIX: $WAS_HOME/AppServer/bin
Then invoke this wscp to check for failed admin and application servers:
wscp -f restartServer.tcl -c restartServer
After restarting any failed admin server, use this command to restart failed application servers:
wscp -f restartServer.tcl -c restartServer 1
Chapter 7 - Integrated Hardware Clustering for
Database High Availability
As discussed in previous chapters, WebSphere relies heavily on database functionality for its operations. In
addition to customer application data, databases are also used for the WebSphere Administrative Repository,
as storage for persistent HTTP sessions, and as storage for some LDAP servers used with WebSphere
security. Protecting data integrity and enhancing availability for the entire WebSphere processing environment
can be achieved by utilizing hardware clustering solutions to provide high availability for these databases.
A hardware cluster is a group of independent computers that work together to run a common set of
applications, in this case database applications, and provide the image of a single system to the client and
application. The computers are physically connected by cables and programmatically connected by cluster
software. These connections allow computers to use failover and load balancing, which is not possible with a
standalone computer.
To the WebSphere Application Server, connecting to a clustered remote database is no different than
connecting to a non-clustered remote database. As the active node in the cluster moves from node to node,
WebSphere will detect that the connections it is holding in its connection pool are no longer valid, flush the
pool and throw the appropriate exception (usually a StaleConnectionException) to the running application to
indicate that the application should establish new connections.
WebSphere Application Server 4.0.x supports many different database technologies; this chapter discusses
the high availability database solutions tested in the WebSphere labs. WebSphere components should perform
properly with these solutions, assuming availability is maintained by the HA solution and the coding practices
discussed in Chapter 6 are followed.
7.1 HACMP and HACMP/ES on IBM AIX
7.1.1 Introduction and Considerations
IBM High Availability Cluster Multiprocessing (HACMP) and High Availability Cluster Multiprocessing
Enhanced Scalability (HACMP/ES) provide the AIX platform with a high-availability solution for mission
critical applications. HACMP allows the creation of clusters of RS/6000 servers to provide high availability of
databases and applications in spite of a hardware, software, or network failure. HACMP/ES includes all of the
same functionality as HACMP, but enables larger clusters of machines and more granular event control.
The unit of failover in HACMP or HACMP/ES is a resource group. A resource group contains all the
processes and resources needed to deliver a highly available service and ideally should contain only those
processes and resources. HACMP for AIX software supports a maximum of up to 20 resource groups per
cluster.
68
As with MC/ServiceGuard (discussed in section 7.2), an HACMP cluster requires two network connections for
a highly available heartbeat mechanism. A public network connects multiple nodes and allows clients to access
these nodes, while a private network connection allows point-to-point communications between two or more
nodes. Each node is typically configured with at least two network interfaces for each network to which it
connects: a service interface that handles cluster traffic, and one or more standby interfaces to allow recovery
from a network adapter failure. This recovery by a standby adapter is typically faster than recovery by another
node, and thus provides less downtime.
The AIX IP address takeover facility is utilized to facilitate node failover. If one node should fail, a takeover
node acquires the failed node’s service address on its standby adapter, making failure transparent to clients
using that service IP address. To enable IP address takeover, a boot adapter IP address must be assigned to
the service adapter on each cluster node. Nodes use this boot address after a system reboot and before the
HACMP for AIX software is started. When the HACMP for AIX software is started on a node, the node’s
service adapter is reconfigured to use the service address instead of the boot address.
To provide data redundancy, a RAID disk array can be shared among the nodes in the cluster.
When a failover occurs, HACMP invokes a stop script to stop the resource group and a start script to start
the resource group. Samples of these scripts are included in Appendix D, but these scripts can be customized
to match the runtime environment.
7.1.2 Expected Reactions to Failures
If a failure occurs, the quickest way to recover is to failover from the service adapter to a standby adapter. In
many cases, for example a node power failure, this is not an option. In this case, the resource group will fail
over to another node in the cluster.
The node on which the resource group is recovered depends on the resource group configuration. There are
three types of resource groups supported:
• Cascading resource groups - A resource may be taken over by one or more nodes in a resource chain
according to the takeover priority assigned to each node. The available node within the cluster with the
highest priority owns the resource. If this node fails, the node with the next highest priority owns the
resource. You can also choose to allow the highest priority node to recover its resources when it is
reintegrated into the cluster, or set a flag requiring the resource to remain on the active lower-priority node.
• Rotating resource groups - A resource is associated with a group of nodes and rotates among these
nodes. When one node fails, the next available node acquires the resource group.
• Concurrent access resource groups - A resource can be managed by the HACMP for AIX Cluster Lock
Manager and may be simultaneously shared among multiple applications residing on different nodes.
Cascading resources provide the best performance since cascading resources ensure that an application is
owned by a particular node whenever that node is active in the cluster. This ownership allows the node to
cache data the application uses frequently. Rotating resources may minimize the downtime for failover.
When this failover occurs, the following steps are run on the new active node:
• run the stop script
• release the disk volume groups and other shared resources held by the primary node
• take over the service IP address
• mount the shared disk array and take over any other resources necessary
• run the start script
Figure 7.1: HACMP Cluster Configuration for the WebSphere System
Figure 7.2: HACMP Cluster for WebSphere after the primary node fails
7.1.3 WebSphere HACMP configuration
The lab environment used to test this configuration consisted of two IBM RS/6000 machines running AIX
4.3.3. These machines were installed with the HACMP software; we tested HACMP 4.3.1, HACMP 4.4, and
HACMP/ES 4.4 PTF 3. Both DB2 7.2.1 and Oracle 8.1.7 were installed. The machines shared an IBM 7133
Serial Storage Architecture Disk Subsystem. The public network connection was supplied by a 100 Mb
Ethernet LAN, while the private connection was supplied by a dedicated RS232 connection.
WebSphere components were also configured in a highly available topology, as previously described in this
document. WebSphere resources requiring database access were configured to communicate with the service
IP address. Applications requiring database connections were programmed in accordance with the
recommendations in Chapter 6.
Figure 7.3: Test environment for integrated WebSphere and HACMP HA system
7.1.4 Tuning heartbeat and cluster parameters
HACMP 4.4 provides control over several tuning parameters that affect the cluster’s performance. Setting
these tuning parameters correctly to ensure throughput and adjusting the HACMP failure detection rate can
help avoid “failures” caused by heavy network traffic.
• Adjusting the high and low watermarks for I/O pacing. The default values are 33 and 24.
• Adjusting the syncd frequency rate. The default value is 10.
• Adjusting the HACMP failure detection rate. Two parameters control the HACMP failure detection rate:
  - HACMP cycles to failure: the number of heartbeats that must be missed before detecting a failure.
  - HACMP heartbeat rate: the number of microseconds between heartbeats.
  For example, with a heartbeat rate of 1 second and 10 cycles, the failure detection rate would be 10 seconds.
  Faster heartbeat rates may lead to false failure detection, particularly on busy networks. Note that new
  values become active the next time cluster services are started.
• AIX deadman switch timeout. If HACMP for AIX cannot get enough CPU resource to send heartbeats
  on the IP and serial networks, other nodes in the cluster will assume the node has failed and initiate takeover
  of the node's resources. To ensure a clean takeover, the deadman switch crashes the busy node if it is
  not reset within a given time period.
7.2 MC/Serviceguard on the HP-Unix Operating System
7.2.1 Introduction and Considerations
Hewlett Packard Multi-Computer/ServiceGuard (MC/ServiceGuard) is a high-availability solution available for
HP-UX systems. This product allows the creation of clusters of HP 9000 series computers that provide high
availability for databases and applications in spite of a hardware, software, or network failure.
MC/ServiceGuard ensures availability of units called packages: collections of services, disk volumes, and IP
addresses. When a cluster starts, a package starts up on its primary node. The MC/ServiceGuard package
coordinator component decides when and where to run, halt, or move packages. User-defined control scripts
are used to react to changes in the monitored resources.
A heartbeat mechanism is used to monitor node and database services and their dependencies. If the heartbeat
mechanism detects a failure, recovery is started automatically. To provide a highly available heartbeat
mechanism, two connections must be available between the MC/ServiceGuard nodes. This can be provided
by dual network connections, or, in a two node configuration, a single network connection and a dedicated
serial heartbeat. This heartbeat redundancy will prevent a false diagnosis of an active node failure.
Network failures are handled by the network manager component. Each active network interface on a node is
assigned a static IP address. A relocatable IP address, also known as a floating IP address or a package IP
address, is assigned to each of the packages. This provides transparent failover to the clients by providing a
single IP address to connect to, no matter which node in the cluster is hosting the service. When a package
starts up, the cmmodnet command in the package control script assigns this relocatable IP address to the
primary LAN interface card in the primary node. Within the same node, both static and relocatable IP
addresses will switch to a standby interface in the event of a LAN card failure. Only the relocatable IP address
can be taken over by an adoptive node if control of the package is transferred. At regular intervals, the
network manager polls all the network interface cards specified in the cluster configuration file. The polling
interface sends LAN packets to all other interfaces in the node that are on the same bridged net and receives
packets back from them. If a network failure is detected, the network manager has two options:
• Move the package to a backup network interface on the same node. This is often referred to as a local
switch. TCP connections to the relocatable IP address are not lost (except for IEEE 802.3, which does not
have the rearp function).
• Move the package to another node in the cluster. This is often referred to as a remote switch. This switch
causes all TCP connections to be lost. If a remote switch of a WebSphere database package is initiated,
connections held in the WebSphere connection pools will be lost. These connections will be marked stale
and will need to be recovered as described in Chapter 6.
Data redundancy is also an important part of this failover mechanism. There are two options available to
provide data redundancy with MC/ServiceGuard:
• Disk mirroring with MirrorDisk/UX - MirrorDisk/UX is a software package for HP-UX which provides
the capability to create up to three copies of your data on different disks. If one disk should fail,
MirrorDisk/UX will automatically keep the data available by accessing the other mirror. This access is
transparent to the applications utilizing the data. To protect against SCSI bus failures, each copy of the
data must be accessed by a separate SCSI bus.
• Redundant Array of Independent Disks (RAID) levels and Physical Volume (PV) links - An array of
redundant disks and redundant SCSI interfaces protects against a single point of failure in the I/O channel.
PV links are used to configure the redundant SCSI interfaces.
You can monitor disks through the HP Event Monitoring Service, a system monitoring framework for the HP
environment.
7.2.2 Expected Reactions to Failures
If a package needs to be moved to another node, the general MC/ServiceGuard process involves the
following steps:
• Stop services and release resources on the failed node
• Unmount file systems
• Deactivate volume groups
• Activate volume groups on the standby node
• Mount file systems
• Assign package IP addresses to the LAN card on the standby node
• Start up services and acquire any resources needed
During the time this failover process is occurring, the package is unavailable for client connections. If the
package contains one or more of the WebSphere databases, errors may occur in this component. Once the
failover is complete, WebSphere components and applications should successfully reconnect, assuming proper
programming practices as outlined in Chapter 6 are followed.
Figure 7.4: MC/ServiceGuard cluster database configuration
Figure 7.5: MC/ServiceGuard cluster configuration after failover
7.2.3 WebSphere MC/ServiceGuard configuration
The lab environment used to test this configuration consisted of two HP 9000 k-class machines running
HP-UX 11.0. These machines were installed with MC/ServiceGuard A11.12. Both DB2 7.2.1 and Oracle
8.1.7 were installed as well. The machines shared an AutoRAID disk array, where the DB2 or Oracle
database instance was created. We used Fast/Wide SCSI interfaces to connect two nodes to the disk array.
The redundant heartbeat mechanism was set up on two Ethernet LANs, one a public LAN connected to the
rest of the machines in our configuration, the other a private LAN for heartbeat only.
WebSphere components were also configured in a highly available topology, as previously described in this
document. WebSphere resources requiring database access were configured to communicate with the
relocatable IP address. Applications requiring database connections were programmed in accordance with the
recommendations in Chapter 6.
Figure 7.6: Test environment for integrated WebSphere and MC/Serviceguard HA system
7.2.4 Tuning heartbeat and cluster parameters
Sending and receiving heartbeat messages among the nodes in the cluster is a key part of the cluster
management technique. If a cluster node does not receive heartbeat messages from the other node within the
prescribed time, a cluster reformation is initiated. At the end of the reformation, if a new set of nodes form a
cluster, that information is passed to the package coordinator. Packages which were running on nodes that are
no longer in the new cluster are transferred to their adoptive nodes.
There are several MC/ServiceGuard parameters you can tune for performance and reliability:
• Heartbeat interval is the normal interval between the transmission of heartbeat messages from one node
to the other in the cluster. The default value is 1 second, with a maximum value of 30 seconds.
• Node timeout is the time after which a node may decide that the other node has become unavailable and
initiate cluster reformation. The default value is 2 seconds, with a minimum of 2 * (heartbeat interval) and a
maximum of 60 seconds. Small values of node timeout and heartbeat interval may increase the potential
for spurious cluster reformation due to momentary system hangs or network load spikes.
• Network polling interval is the frequency at which the networks configured for MC/ServiceGuard are
checked. The default value is 2 seconds, and the value can range from 1 to 30 seconds; changing it affects
how quickly a network failure is detected.
• Choosing switching and failover behavior: Switching the IP address from a failed LAN card to a
standby LAN card on the same physical subnet may take place if Automatic Switching is set to Enabled in
SAM. You can define failover behavior and whether the package will fall back automatically as soon as the
primary node is available.
• Resource polling interval is the frequency of monitoring a configured package resource. The default value
is 60 seconds, with a minimum value of 1 second.
7.3 Microsoft Clustered SQL Server on Windows 2000
7.3.1 Introduction and Considerations
In the Windows 2000 Advanced Server and Datacenter Server operating systems, Microsoft introduces two
clustering technologies that can be used independently or in combination, providing organizations with a
complete set of clustered solutions that can be selected based on the requirements of a given application or
service. Windows clustering technologies include:
• Cluster service. This service is intended primarily to provide failover support for applications such as
databases, messaging systems, and file/print services. Cluster service supports 2-node failover clusters in
Windows 2000 Advanced Server and 4-node clusters in Datacenter Server. Cluster service is ideal for
ensuring the availability of critical line-of-business and other back-end systems, such as Microsoft
Exchange Server or a Microsoft SQL Server (TM) 7.0 database acting as a data store for an e-commerce
web site.
• Network Load Balancing (NLB). This service load balances incoming IP (Internet Protocol) traffic
across clusters of up to 32 nodes. Network Load Balancing enhances both the availability and scalability
of Internet server-based programs such as web servers, streaming media servers, and Terminal Services.
By acting as the load balancing infrastructure and providing control information to management applications
built on top of Windows Management Instrumentation (WMI), Network Load Balancing can seamlessly
integrate into existing web server farm infrastructures. Network Load Balancing will also serve as an ideal
load balancing architecture for use with the Microsoft release of the upcoming Application Center in
distributed web farm environments.
See Introducing Windows 2000 Clustering Technologies at
http://www.microsoft.com/windows2000/techinfo/howitworks/cluster/introcluster.asp for more information.
The failover mechanism in the Windows 2000 active/passive clustering works by assigning a virtual hostname /
IP address to the active node. This is the hostname exposed to all external clients accessing the clustered
resource. When a resource on the active node fails, the cluster moves to the passive node, making it the active
node and starting all the necessary services. Additionally, the virtual hostname / IP address is moved from the
first node to the newly activated node and handles all new requests from the clients. While this transition is
taking place, the database will not be available and requests will fail. However, once the transition is complete,
the only indication to the client that something has failed is that the connections to the cluster are no longer valid
and must be reestablished.
7.3.2 Expected Reactions to Failures
The Microsoft cluster can fail in several ways.
• Manual Push to Passive Node - Within the cluster administrator, you can right-click on a group and
click "Move Group" to move it to the passive node.
• Clean Shutdown of Active Node - Without manually moving any of the groups, go to Start->Shutdown
and power down the Active Node.
• Unexpected power failure on Active Node - Physically pull the power cable from the Active Node.
• Public network failure on the Active Node - Only pull the public network cable from the Active Node.
When these failures occur, the cluster service should recognize that one (or more) of the resources failed on
the active node and transition all of the components from the failing node to the alternate node. At this point, all
the connections to WebSphere are broken, so on the next request from WebSphere, stale connections would
be detected and a StaleConnectionException would be thrown to the WebSphere Application (as described in
Chapter 6). After the transition to the new active node has completed WebSphere will reestablish connections
to the database. Applications programmed according to the guidelines in Chapter 6 would also reconnect to
the database.
If both the private and public networks on the active node fail, the cluster service will not transition the
components to the alternate node. When both networks go down and all communication is lost, both nodes
think the other node is down and try to take control of the shared disk. However, the disk is already locked
by the active node, so the standby node’s takeover request will fail. Meanwhile, the active node will attempt to
move the cluster to another node. However, since all of its network connections were lost, the active node will
see itself as the only node in the cluster.
7.3.3 WebSphere Microsoft Cluster Server Configuration
The lab environment used to test this configuration consisted of two IBM Netfinity 7000 M10 machines
running Microsoft Windows 2000 and Microsoft SQL Server 2000 Enterprise Edition.
Database Server Node
  Hardware: IBM Netfinity 7000 M10
  • 4 internal 16 GB SCSI hard drives
  • Adaptec AIC-7895 SCSI controller
  • 2 GB RAM
  • 4x500 MHz Pentium III Xeon processors
  Software:
  • Microsoft Windows 2000 Advanced Server with Service Pack 2 applied (includes clustering software)
  • Microsoft SQL Server 2000 Enterprise Edition OR IBM DB2 7.2 Enterprise-Extended Edition
Shared Disk
  Hardware: IBM EXP15
  • 10 drive array
  • 2 SCSI interfaces
WebSphere Server
  Hardware: IBM M Pro
  • 1x933 MHz processor
  • 512 MB RAM
  Software: Windows NT 4.0 SP 6a
Notes:
• Even though the WebSphere server was tested on an NT platform, the same functionality occurs on any of
the distributed WebSphere platforms.
• The nodes of the cluster were identical boxes with identical software installed on them, so this data is
only presented once even though two boxes were used.
Figure 7.7: Microsoft Cluster Configuration
7.3.4 Tuning heartbeat and cluster parameters
To ensure proper performance of the heartbeat communications, it is recommended that you disable any
unnecessary network protocols, and ensure proper TCP/IP configuration. It is also recommended that the
“Media Sense” configuration option be disabled in Windows 2000. For more information, see the Microsoft
Support Article “Recommended Private "Heartbeat" Configuration on a Cluster Server (Q258750)” at
http://support.microsoft.com/default.aspx?scid=kb;en-us;Q258750.
Chapter 8 - Failover for Other Components
Figure 8.1: WebSphere Topology with Firewalls
8.1 Firewalls
In a web application environment a firewall is, or should be, a basic component of the infrastructure. In fact,
most environments have multiple firewalls. As noted in the introduction to this paper, the outage of a firewall
can result in a catastrophic outage, much like losing the network itself. High availability for a firewall is
therefore a requirement for an e-business environment. If a firewall is down, customers are unable to access
applications on your website, and, more importantly, in some configurations your web infrastructure is
vulnerable to attacks from hackers during the outage.
High availability for firewalls can be achieved in a number of different ways, but the two most common involve
the use of either a hardware clustering implementation such as HACMP, or a load balancing cluster using a
product such as the WebSphere Edge Server Network Dispatcher.
The first approach using hardware clustering is depicted below. Two separate servers are configured to run the
firewall software as well as the hardware clustering software. Under normal operation all traffic is routed
through the first firewall.
Figure 8.2: Normal HACMP Firewall Operations
When an outage is detected the hardware clustering software is configured to effect an IP service address
takeover. Migrating the service IP address to the backup node allows the client to continue to use the same IP
address. The firewall workstation that has the service IP addresses is the active node from a hardware cluster
perspective. It will get all the IP traffic. The result of this is depicted below.
Figure 8.3: HACMP Firewall Failover Operation
The second approach using load balancing clustering is depicted below. Two separate servers are configured
to run the firewall software as well as the load balancing clustering software. Under normal operation all traffic
is routed through the first firewall.
Figure 8.4: Normal eND Firewall Operations
Failover in this configuration occurs in much the same manner as with hardware clustering. In this scenario the
cluster IP addresses are configured as an alias on the active firewall. In the case of a takeover, this alias is
deleted (or replaced by aliases to the loopback device) and moved to the standby server. The only information
exchange required between the two firewall servers is the heartbeat of ND. An advantage to using load
balancing software for this purpose is that not only can traffic be routed around a firewall that is out of service,
but the software can also be configured to distribute load between the firewalls, improving performance and
throughput, as depicted below.
Figure 8.5: Load Balanced eND Firewall Operations
The major difference between these two approaches is that hardware clustering software can only be used to
provide high availability, while load balancing software such as WebSphere Edge Server (WES) Network
Dispatcher can provide both high availability and load balancing. Arguably, configuration of hardware clustering
is a bit more complex than that of WES Network Dispatcher. The downside to WES is that it has no concept of
rotating resources; as a result, if the primary firewall fails, a second takeover is required once service is
restored, since the primary WES machine has to be the active one.
In the end, the choice of a high availability mechanism for a firewall isn’t as important as having one.
8.2 LDAP Server
Another component that is sometimes overlooked when considering HA is the LDAP server. Much like the
database server, the LDAP server represents a SPOF unless some manner of replication or clustering is
provided. LDAP servers provide two mechanisms for scaling and availability: replication and referral.
Replication
Replication involves the designation of one server as the master server and one or more replicas. The master is
used for all updates and then propagates changes to the replicas.
Replicas can serve requests to:
• Bind
• Unbind
• Search
• Compare
Any requests for
• Add
• Delete
• Modify
• ModifyRDN/modifyDN
that arrive at a replica are directed to the master. This is depicted below.
If the master server fails, replicas can continue to handle read-only requests; however, write operations will
fail. LDAP replication by itself also provides no mechanism for load distribution or failover. Failover is
accomplished either manually, by designating a replica to be the new master, or via an external load
distribution mechanism. Unless detection of an outage and re-designation of the master can be accomplished
in an automatic and timely fashion, a loss of service will occur.
Figure 8.6: LDAP Master-Replica Configuration
Referral
Though not a supported configuration with WebSphere Application Server, referral is a variation of LDAP
replication that employs multiple masters, typically for an organization or workgroup within a company. Unlike
replication, where the master contains all data, with referral there is no single master directory. Instead there is
a series of masters, each of which contains information specific to its organization or workgroup. Requests
that arrive at a server for information contained on another server are referred to the appropriate server.
As with replication, there is no inherent provision for load distribution or failover. Some multi-master
configurations also have a local replica to enable write HA; writes are queued locally until the master comes
back up.
Figure 8.7: LDAP Referral Replication
Hardware HA
In an HA hardware cluster for LDAP, two separate servers are configured to run the hardware clustering
software. Each server is also configured as an LDAP server, one as the master, the other as a replica. Under
normal operation all traffic is routed to the master (primary) LDAP server.
Figure 8.8: Hardware HA for LDAP - Normal Operation
When an outage is detected, the hardware clustering software is configured to effect an IP service address
takeover. The LDAP server that holds the service IP address is the active node from a hardware cluster
perspective and receives all IP traffic; it is also now designated as the master. The result is depicted below.
Figure 8.9: Hardware HA for LDAP - Failover Operation
Once service is restored to the primary LDAP server it will be necessary to re-designate the master and the
replica to the original configuration.
Load Balancing
DNS RoundRobin
A common mechanism for load distribution of LDAP requests is DNS RoundRobin. With DNS RoundRobin,
the server-side name server is modified to respond to translation requests with the IP addresses of different
hosts in round-robin fashion, partitioning client requests among the replicated hosts.
Unfortunately, this approach suffers from a couple of drawbacks that preclude its use with WebSphere.
Both the Java runtime and the operating system cache DNS resolution. This caching, combined with the fact
that the DNS server is typically unaware of an LDAP directory failure, means that WebSphere Application
Server could continue to attempt binds to both available and failed LDAP servers. The bind to a failed LDAP
server will of course fail, and the user will not be authenticated.
Load Balancing Cluster
Failover in this configuration occurs in much the same manner as with hardware clustering. In this scenario the
cluster IP addresses are configured as an alias on the active LDAP. In the case of a takeover, this alias is
deleted (or replaced by aliases to the loopback device) and moved to the standby server. The only information
exchange required between the two LDAP servers is the heartbeat of WebSphere Edge Server Network
Dispatcher (eND). An advantage to using load balancing software for this purpose is that not only can traffic
be routed around an LDAP server that is out of service, but the software can also be configured to distribute
load between the two servers, improving performance and throughput.
Though not depicted, another variation of this configuration would be to dedicate servers to running the eND
component. This might be of value in an organization with a very large number of servers, simply for ease of
administration. With a smaller number of servers, eND can be co-located with the LDAP servers, as shown in
Figure 8.10, saving the expense of additional servers dedicated to the task, as is sometimes required with other
load distribution products.
Figure 8.10: eND LDAP Configuration
The advantages and disadvantages of a Load Balancing cluster as compared to a hardware HA cluster are the
same with LDAP as with a firewall.
Appendix A
Installing Network Dispatcher (ND) on Windows NT
Introduction
This appendix walks you through the installation of ND on an NT machine. The ND forwards requests to web
servers that are also installed on NT machines. The procedure for other platforms is quite similar to the one
described here. Also, the configuration described here is based on Ethernet; if your network uses token-ring,
the procedure is similar, though there might be some differences.
These instructions (for the most part) are also provided in the Redbook “WebSphere Edge Server: Working
with Web Traffic Express and Network Dispatcher”, SG24-6172-00, July 2001. Download it from
http://www.redbooks.ibm.com before starting your installation, as we will be referencing it in this appendix.
Chapter 3 of this Redbook deals with installation of the ND.
Obtain 3 NT machines:
Machine A: Machine on which you will install ND.
Machine B and Machine C: Machines on which you have your web servers
Machine A (IP1, IP2)  --------->  Machine B (IP3)
                      --------->  Machine C (IP4)
You will need four IP addresses to set up this configuration. All four IP addresses must be in the same subnet;
that is, the first three octets of all four addresses must be the same. For example:
1st IP: 25.34.145.A1 - the primary IP address of machine A
2nd IP: 25.34.145.A2 - the IP address to be load balanced, used by machine A
- also known as the Cluster Address or Virtual IP Address
- typed in a web browser's URL field to initiate the HTTP request
3rd IP: 25.34.145.B - the IP address of machine B
4th IP: 25.34.145.C - the IP address of machine C
Pre-Installation Setup
Machine A setup:
This is the machine on which you will install ND. No special setup is required on this machine at this time.
Before starting to install ND, make sure that you have an additional IP address, the cluster address, available
for this machine. Contact your system administrator to obtain an additional IP address in the same subnet.
Note: You will use the 2nd IP address after you have installed ND. At no point will you have to manually add
this 2nd IP address to the Network Properties in your NT system. For now make sure that you have an
additional IP address available to you. We will discuss the usage of this IP address in detail later.
Machines B and C setup:
Before installing ND on machine A, configure the NT machines on which you have the two web servers. These
machines have to be configured to receive and process HTTP requests that were originally addressed to the
cluster address represented by the ND machine. Therefore we need to install a loopback adapter that is
configured with the Cluster Address that will be used by machine A.
Install a loopback adapter on each web server machine:
Refer to Chapter 3.2, page 95 of the Redbook “WebSphere Edge Server: Working with Web Traffic Express
and Network Dispatcher”. This section titled “Configuration of the back-end servers” walks you through this
step. No additional hardware or software is needed to install a loopback adapter. After installing the loopback
adapter, your web server machines are equipped to accept requests originally sent for the cluster address.
Delete a route from the routing table:
Continue with the instructions in the Redbook and follow them to delete the extra route (a sketch of the commands involved is shown below).
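As a rough sketch only (the addresses are the placeholders used earlier in this appendix, and the exact route to remove should be taken from the Redbook procedure), the routing table can be inspected and the extra route removed from a command prompt on each web server machine:
route print
route delete 25.34.145.0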
You are now ready to start installing Network Dispatcher on machine A.
Installation of ND
Through the years, ND has been shipped packaged as a part of other IBM products. Currently, ND is
shipped as a part of WebSphere Edge Server. You can download WebSphere Edge Server (WES) from:
http://www-4.ibm.com/software/webservers/edgeserver/
JDK 1.3 is a prerequisite for ND. If you have a WebSphere 4.0 installation on an NT machine, you can use its
JDK with ND: simply copy the directory C:\WebSphere\AppServer\java from your WebSphere machine to
machine A, on which we plan to install ND. If you do not have a WebSphere installation, you
can download JDK 1.3 from:
http://www.ibm.com/developerworks/java/jdk/index.html
Follow instructions in the section titled “Installing Network Dispatcher on Windows NT” on page 40 of
the Redbook to install the JDK. After installation update your PATH environment variable to include JDK 1.3.
If you have other JDKs on the system, make sure that JDK 1.3 is before any other JDK in the PATH variable.
After the JDK installation, the Redbook discusses ND installation. Continue with the installation of ND as
described in the Redbook. At the end of the installation you will be required to restart your system - go ahead
and restart it when prompted.
In the next section we will configure ND to accept requests for the Cluster Address 25.34.145.A2
Post installation configuration of ND
On Windows NT, ND starts automatically as a service. Before starting the configuration process, make sure
that its status is “Started” by viewing the Services menu in the Control Panel.
Create Keys:
Open a command prompt and run the command:
C:\create ndkeys
The successful completion message is: “Key files have been created successfully.”
This will create a key file required for administration purposes.
Configure the Web Server cluster:
This is where we will configure ND to receive requests on behalf of the Cluster Address 25.34.145.A2.
This can be done using either a command-line utility called ndcontrol or a GUI-based tool. We will use the
GUI to perform our configuration; the command-line tool can be used to automate administrative tasks. The
configuration consists of the following steps:
1. Create a cluster for the IP address 25.34.145.A2
2. Create a port number to be load balanced. This port number in our case would be the default HTTP port
80
3. Add servers to which ND will forward the HTTP requests. Here we will add servers B and C.
Refer to section 4.1.2 of the Redbook. The section is titled “Configuration” and is on page 71.
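For reference, a minimal sketch of the equivalent ndcontrol commands is shown below. The cluster and server addresses are the placeholders used earlier in this appendix, and the exact command set can vary by Network Dispatcher release, so treat this as an illustration rather than a substitute for the Redbook procedure:
ndcontrol executor start
ndcontrol cluster add 25.34.145.A2
ndcontrol port add 25.34.145.A2:80
ndcontrol server add 25.34.145.A2:80:25.34.145.B
ndcontrol server add 25.34.145.A2:80:25.34.145.C
ndcontrol manager start
ndcontrol advisor start http 80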
After finishing the configuration, test your basic ND functionality:
1. Start the web servers on machines B and C. Make sure you can access the default page on the web servers
by pointing a web browser directly to their IP addresses. In our case, we used IBM HTTP Server on both the
web server machines and edited the <title> tag in the file C:\IBM HTTP Server\htdocs\index.html on both
machines so that we could identify the web server machine while using ND.
2. Open a web browser and type in the URL corresponding to the Cluster Address. If everything is installed
and configured correctly, you will see an HTML page served by either machine B or machine C. You will be
able to identify the machine by looking at the title of the web page. Refresh the page several times until you
have seen the HTML pages from both web server machines.
Appendix B - Configuring TCP Timeout parameters by OS
Windows NT/2000 - On Windows NT/2000, the TCP timeout value defaults to 3000 milliseconds, or 3
seconds. However, the operating system retries the connection twice before failing, and on each retry it waits
twice as long as the previous wait. This means that the default value of 3000 milliseconds (3 seconds) is in
actuality an effective timeout of 3000+6000+12000 milliseconds, or 21 seconds.
This value was hard coded prior to Windows NT Service Pack 5. In Windows NT Service Pack 5 and
higher, or Windows 2000, the following procedure can be used to view/modify this value:
1. Start Registry Editor (Regedt32.exe).
2. Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
3. On the Edit menu, click Add Value, and then add the following registry value:
Value Name: InitialRtt
Data Type: REG_DWORD
Value: <value> decimal (where <value> is the timeout value in milliseconds. For example, 5000 to set it to
5 seconds)
Valid Range: 0-65535 (decimal)
Default: 0xBB8 (3000 decimal)
Description: This parameter controls the initial retransmission timeout used by TCP on each new
connection. It applies to the connection request (SYN) and to the first data segment(s) sent on each
connection.
4. Quit Registry Editor.
5. Restart the computer.
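If you prefer to script this change rather than edit the registry interactively, the same value can be applied with a .reg file along the following lines. This is a sketch only: the value name is the one described above, and the data shown (0x1388 hex, 5000 decimal) is just an example of a 5-second setting.
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters]
; InitialRtt in milliseconds (here 5000 = 5 seconds)
"InitialRtt"=dword:00001388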
AIX - On AIX, the TCP timeout value is specified in half-seconds, and can be viewed or modified using the
network options (no) command. The default value is 150 half-seconds (75 seconds).
To view the timeout value enter the command
/usr/sbin/no -o tcp_keepinit
To set the timeout value enter the command
/usr/sbin/no -o tcp_keepinit=<value>
where <value> is the timeout value in half-seconds. This is a runtime value, and it must be set again each time
the machine reboots; one common way of automating that is sketched below.
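Because the setting does not survive a reboot, one common approach (an assumption on our part, not a product requirement) is to append the command to a startup script such as /etc/rc.net so that it is reapplied automatically:
# Re-apply the TCP connect timeout at boot (example: 100 half-seconds = 50 seconds)
if [ -x /usr/sbin/no ] ; then
        /usr/sbin/no -o tcp_keepinit=100
fi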
Solaris - On Solaris, the timeout value defaults to 180000 milliseconds, or 3 minutes. This value can be
viewed or set using the ndd command.
To view the timeout value enter the command
ndd -get /dev/tcp tcp_ip_abort_cinterval
To set the timeout value enter the command
ndd -set /dev/tcp tcp_ip_abort_cinterval <value>
where <value> is the timeout value in milliseconds.
HP-UX - On HP, the timeout value defaults to 75000 milliseconds, or 75 seconds. This value can be viewed
or set using the ndd command.
To view the timeout value enter the command
ndd -get /dev/tcp tcp_ip_abort_cinterval
To set the timeout value enter the command
ndd -set /dev/tcp tcp_ip_abort_cinterval <value>
where <value> is the timeout value in milliseconds.
Appendix C - MC/ServiceGuard setup instructions
• Install the MC/ServiceGuard software on each node with swinstall and choose the B3935DA package. For
MC/ServiceGuard installation details, see the manual shipped with the MC/ServiceGuard software.
• Configure and update each node for the MC/ServiceGuard cluster
Give security permissions for both machines by adding entries to the /etc/cmcluster/cmclnodelist file:
hp1.somecorp.com    root    # WebSphere database cluster
hp2.somecorp.com    root    # WebSphere database cluster
If you want to allow non-root users to run cmviewcl, you should also add their non-root user IDs to this file.
Define the name resolution service. By default, MC/ServiceGuard uses /etc/resolv.conf to obtain the
addresses of the cluster nodes. In case DNS is not available, you should configure the /etc/hosts file and
configure /etc/nsswitch.conf to search /etc/hosts when the other lookup strategies are not working.
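For example, a minimal sketch (the hostnames match the sample cmclnodelist entries above; the IP addresses are placeholders):
# /etc/nsswitch.conf - try DNS first, fall back to /etc/hosts if DNS cannot answer
hosts: dns [NOTFOUND=continue UNAVAIL=continue TRYAGAIN=continue] files

# /etc/hosts - entries for both cluster nodes
192.168.10.11   hp1.somecorp.com   hp1
192.168.10.12   hp2.somecorp.com   hp2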
• Set up and configure the shared disk array
Connect the shared disk array to both nodes.
Create volume groups, logical volumes, and mirrors with pvcreate, vgcreate, vgextend, lvcreate, and
lvextend (a sample command sequence is sketched below).
Create the cluster lock disks.
Distribute the volume groups to the other node; you can distribute volume groups either with SAM or with
LVM commands.
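A rough sketch of the corresponding LVM commands on the first node follows; the device files, volume group name, and sizes are illustrative assumptions, not values from our test configuration:
pvcreate -f /dev/rdsk/c4t0d0                 # initialize the shared disk as a physical volume
mkdir /dev/vgwasdb
mknod /dev/vgwasdb/group c 64 0x040000       # volume group special file (unique minor number)
vgcreate /dev/vgwasdb /dev/dsk/c4t0d0        # create the shared volume group
lvcreate -L 4096 -n lvwasdb /dev/vgwasdb     # logical volume (4 GB) for database data
newfs -F vxfs /dev/vgwasdb/rlvwasdb          # journaled file system for the package
To distribute the volume group, export a map file on the first node with vgexport -p -s -m /tmp/vgwasdb.map /dev/vgwasdb and import it on the second node with vgimport -s -m /tmp/vgwasdb.map /dev/vgwasdb.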
• Configure the MC/ServiceGuard cluster for the WebSphere databases
Use SAM and select Cluster -> High Availability Cluster.
Choose Cluster Configuration.
Select the Actions menu, choose Create Cluster Configuration, and follow the instructions.
Verify the cluster configuration with
cmcheckconf -k -v -C /etc/cmcluster/webspheredb2.config (for IBM DB2)
cmcheckconf -k -v -C /etc/cmcluster/websphereoracle.config (for Oracle)
Distribute the binary configuration file to the other node, either with SAM or with commands (see the sketch below).
Back up the volume group and cluster lock configuration data for possible replacement of disks later on.
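A minimal command-line sketch of the verify-and-distribute step, assuming the DB2 configuration file named above (cmapplyconf both re-verifies the ASCII file and distributes the resulting binary configuration to all cluster nodes):
cmcheckconf -k -v -C /etc/cmcluster/webspheredb2.config
cmapplyconf -v -C /etc/cmcluster/webspheredb2.config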
• Configure packages and their services
Install DB2 or Oracle on both machines and LDAP onto the shared disk.
Create the database instances in the shared volume group.
Use SAM to configure the packages.
Customize the package control scripts for volume group activation, service IPs, volume groups, service start,
and service stop. Since the control scripts are very long, we give only the key functions of our sample scripts
for DB2 and Oracle, as follows:
For DB2, our sample service start script is
function customer_defined_run_cmds
{
    # Start the DB2 instance (db2inst4) as part of package startup.
    su - db2inst4 <<STARTDB
db2start
STARTDB
    test_return 51
}
And our sample DB2 service stop script is
function customer_defined_halt_cmds
{
    # Force off connected applications, then stop the DB2 instance.
    su - db2inst4 <<STOPDB
db2 force applications all
sleep 1
db2stop
STOPDB
    test_return 52
}
For Oracle, our sample service start script is
function customer_defined_run_cmds
{
    # Start the Oracle listener and each instance used by WebSphere.
    # The quoted here-document delimiter keeps $SIDS and $SID from being
    # expanded by the control script itself.
    su - oracle <<'STARTDB'
lsnrctl start
export SIDS="APP ADMIN SESSION"
for SID in $SIDS ; do
    export ORACLE_SID=$SID
    echo "connect internal\nstartup\nquit" | svrmgrl
done
STARTDB
    test_return 51
}
And our sample Oracle service stop script is
function customer_defined_halt_cmds
{
    # Shut down each Oracle instance, then stop the listener.
    su - oracle <<'STOPDB'
export SIDS="APP ADMIN SESSION"
for SID in $SIDS ; do
    export ORACLE_SID=$SID
    echo "connect internal\nshutdown\nquit" | svrmgrl
done
lsnrctl stop
STOPDB
    test_return 52
}
Distribute the package configuration by SAM
• Verify the cluster operation and configuration to ensure that:
Heartbeat networks are up and OK
Networks are up and OK
All nodes are up and OK
All configured properties are what you want
All services, such as DB2, Oracle, and LDAP, are up and OK
The logs contain no errors
• Verify system failover from SAM by moving the packages from one node to the other node. The
command-line equivalents are sketched below.
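For reference, a sketch of the command-line equivalents; the package name webspheredb2pkg and node name hp2 are placeholders for your own configuration:
cmviewcl -v                        # check cluster, node, and package status
cmhaltpkg webspheredb2pkg          # halt the package on its current node
cmrunpkg -n hp2 webspheredb2pkg    # run the package on the other node
cmmodpkg -e webspheredb2pkg        # re-enable automatic package switching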
Appendix D HACMP setup instructions
• Install HACMP 4.3.1, HACMP 4.4, or HACMP/ES 4.4 ptf3
Use smit to install HACMP 4.3.1, HACMP 4.4, or HACMP/ES 4.4 ptf3 on both nodes. For installation
details, please see the HACMP for AIX Installation Guide. You can also install HACMP after you configure
the network adapters and the shared disk subsystem.
Before you configure HACMP, the network adapters must be defined, the AIX operating system must be
updated, and you must give the cluster nodes permission to access one another.
Modify the following configuration files: /etc/netsvc.conf, /etc/hosts, and /.rhosts. Make sure that each
node's service adapters and boot addresses are listed in the /.rhosts file on each cluster node so that the
/usr/sbin/cluster/utilities/clruncmd command and the /usr/sbin/cluster/godm daemon can run.
• Service network configuration
A public network is used to provide services to clients (WebSphere, applications, LDAP); for example, we
define two TCP/IP public networks in our configuration. A public network consists of a service/boot adapter
and any standby adapters, and it is recommended that you use one or more standby adapters. Define the
standby IP addresses and the boot IP addresses. For each adapter, use smit mktcpip to define the IP label, IP
address, and network mask. HACMP will define the service IP address. Since this configuration process also
changes the hostname, configure the adapter with the desired default hostname last. Then use smit chinet to
change the service adapters so that they boot from the boot IP addresses. Check your configuration with
"lsdev -Cc if". Finally, ping the nodes to test the public TCP/IP connections.
• Serial network configuration
A serial network is needed to exchange heartbeat messages between the nodes in an HACMP cluster. A
serial network allows the Cluster Manager to continue to exchange keepalive packets should the TCP/IP-based
subsystem, networks, or network adapters fail. The private network can be a raw RS232 serial line, target
mode SCSI, or a target mode SSA loop. The HACMP for AIX serial line (a null-modem, serial-to-serial cable)
is used to connect the nodes. Use smit tty to create the tty device. After creating the tty device on both nodes,
test communication over the serial line by entering the command stty < /dev/ttyx on both nodes (where
/dev/ttyx is the newly added tty device). Both nodes should display their tty settings and return to the prompt if
the serial line is OK. After testing, define the RS232 serial line to HACMP for AIX.
• Shared disk array installation and LVG configuration
The administrative data, application data, session data, LDAP data, log files, and other file systems that need
to be highly available are stored on shared disks that use RAID technologies or are mirrored to protect the
data. The shared disk array must be connected to both nodes with at least two paths to eliminate a single point
of failure. We use an IBM 7133 Serial Storage Architecture (SSA) Disk Subsystem.
You can configure the shared volume group for either concurrent or nonconcurrent access. A nonconcurrent
access environment typically uses journaled file systems to manage data, while concurrent access environments
use raw logical volumes. A graphical interface called TaskGuide simplifies the task of creating a shared volume
group within an HACMP cluster configuration. In version 4.4, the TaskGuide has been enhanced to create a
JFS log automatically and to display the physical location of available disks. After one node has been
configured, import the volume groups to the other node by using smit importvg.
• DB2 or Oracle and LDAP installation, configuration, instance and database creation
For installation details, see the manuals for these products. You can install these products either on the local
disks of both nodes or on the shared disk, but you must keep all shared data, such as database files,
transaction log files, and other important files, on the shared disk array so that the other node can access that
data when the current node fails. We chose to install these products on each node; in that case, you must
install the same version of the products on both nodes.
Create the DB2 or Oracle instances on the shared disk subsystem. We created three DB2 instances for the
WebSphere administrative, application, and session databases. In the other test, with Oracle, we likewise
created three Oracle instances for the administrative, application, and session databases. You may need to
change applheapsz for the created DB2 databases or the cursor parameters for Oracle; see the WebSphere
installation guide for details.
Install database clients on the WebSphere nodes and configure them to connect to the database server if
"thick" database clients are used. For example, install DB2 clients on all WebSphere nodes and catalog the
remote node and database server (a sketch of the catalog commands is shown below).
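As an illustration, the DB2 client catalog commands run on each WebSphere node look roughly like the following; the node name, server address, port, and database name are placeholders, not values from our configuration:
db2 catalog tcpip node wasdbnod remote dbserv.somecorp.com server 50000
db2 catalog database wasapp at node wasdbnod
db2 terminate
db2 connect to wasapp user db2inst1          # quick connectivity check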
• Define the cluster topology and HACMP application servers
The cluster topology comprises the cluster definition, cluster nodes, network adapters, and network modules.
The cluster topology is defined by entering information about each component into HACMP-specific ODM
classes. These tasks can be done by using smit HACMP; for details, see the HACMP for AIX Installation
Guide.
An "application server", in HACMP or HACMP/ES, is a cluster resource that is made highly available by the
HACMP or HACMP/ES software. In our case these are the DB2 databases, Oracle databases, and LDAP
servers; this is not to be confused with the WebSphere application server. Use smit HACMP to define the
HACMP (HACMP/ES) application server by a name and its start and stop scripts.
• Start and stop scripts for both DB2 and Oracle as the HACMP application servers
Our sample DB2 service start script is
db2start
And our sample DB2 service stop script is
db2 force applications all
db2stop
For Oracle, our sample service start script is
lsnrctl start
export SIDS="APP ADMIN SESSION"
for SID in $SIDS ; do
    export ORACLE_SID=$SID
    echo "connect internal\nstartup\nquit" | svrmgrl
done
And our sample Oracle service stop script is
export SIDS="APP ADMIN SESSION"
for SID in $SIDS ; do
export ORACLE_SID=$SID
echo "connect internal\nshutdown\nquit" | svrmgrl
done
lsnrctl stop
You must be the db2 or oracle user to run the above scripts; otherwise, become that user with su, for example
from a wrapper script like the sketch below.
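For example, a minimal wrapper script that HACMP can invoke as root might look like the following sketch; the instance name db2inst1 is an assumption, so substitute your own instance:
#!/bin/ksh
# Hypothetical HACMP application server start script for a DB2 instance
su - db2inst1 -c "db2start"
exit 0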
• Define and configure resource groups
For HACMP and HACMP/ES to provide a highly available application server service, the service needs a set
of cluster-wide resources essential to uninterrupted processing. A resource group can contain both hardware
and software resources, such as disks, volume groups, file systems, network addresses, and the application
servers themselves.
A resource group is configured to have a particular kind of relationship with a set of nodes. There are three
kinds of node relationships: cascading, concurrent access, or rotating. For a cascading resource group, setting
the cascading without fallback (CWOF) attribute will minimize the client failure time; we used this configuration
in our tests. Use smit to configure the resource groups and the resources in each group. Finally, synchronize
the cluster resources to send the information contained on the current node to the other node.
• Cluster verification
Use /usr/sbin/cluster/diag/clverify on one node to check that all cluster nodes agree on the cluster
configuration and the assignment of HACMP for AIX resources. You can also use smit HACMP to verify the
cluster. If all nodes do not agree on the cluster topology and you want to define the cluster as it is defined on
the local node, you can force agreement of the cluster topology onto all nodes by synchronizing the cluster
configuration. After the cluster verification is OK, start the HACMP cluster services by using smit HACMP on
both nodes, monitor the log file with tail -f /tmp/HACMP.out, and check the database processes with
ps -ef | grep db2 (or ora).
• Takeover verification
To test a failover, use smit HACMP to stop the cluster service with the takeover option. On the other node,
enter the following command to watch the takeover activity: tail -f /tmp/HACMP.out
You have several places to look for log information: /usr/adm/cluster.log provides a high-level view of current
cluster status and is a good place to look first when diagnosing a cluster problem, while /tmp/HACMP.out is
the primary source of information when investigating a problem.
You can also configure a remote machine to connect to the HACMP cluster and use clstat (or xclstat) to
monitor and test the HACMP cluster.
Appendix E - Microsoft Clustering Setup Instructions
The installation process must be followed exactly as presented here to ensure an accurate installation of the
products. Read through the entire procedure and make sure you understand each aspect before starting; once
you start, it is difficult to go back without starting from scratch.
Verifying the Hardware/Software
Before installing any aspect of this environment, you want to verify that the hardware and software you are
about to work with is the proper version.
The following links provide a list of prerequisites necessary for this installation.
• Hardware Compatibility List (HCL) for Microsoft Windows 2000 Advanced Server
http://www.microsoft.com/windows2000/server/howtobuy/upgrading/compat/default.asp
• HCL for Microsoft Clustering Service
http://www.microsoft.com/hcl/default.asp (Search on "cluster")
• SQL Server Enterprise Edition Prerequisites
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/instsql/in_overview_74vn.as
NOTE: You must use SQL Server Enterprise Edition to get Clustering support. The Developer Edition
and Professional Edition are not cluster aware.
Setup SCSI Hardware for Shared Disk
For the testing that was done for this document, the IBM EXP 15 disk array was used as the shared disk. To
get both nodes to utilize this disk array appropriately, the SCSI cards in each node needed to be configured to
avoid a conflict. The following are the SCSI options that were changed: on the EXP 15, all of the DIP switches
were set to off, which provided one chain of 10 drives in the array; on node 1 of the cluster we set the SCSI
ID (in the SCSI BIOS) to 6 and on the other node to 7. This was the only change from the default SCSI setup
we needed to make.
Installing Microsoft Clustering Services
Once you have all of your prerequisites accounted for, installing the Microsoft Clustering Services in Windows
2000 is fairly straightforward. I would recommend you read through the documents provided in the Useful
Resources section of this document to learn more about the process if you have any questions.
During setup of our environment, we followed the Microsoft Clustering Step-by-Step guide (see Useful
Resources). The process outlined in that paper is fairly good. A list of additional notes we gathered while
running through this process is below.
1. Make sure you have a server running the Active Directory service and a DNS server. This is necessary for
adding the cluster nodes to a Windows 2000 domain and for providing domain-wide user accounts that many
of these services will run under.
2. Add the Cluster IP Address and Hostname to the DNS or hosts file of the second node to allow it to find
the active cluster during the installation.
3. SQL Server depends on the Microsoft Distributed Transaction Coordinator for distributed queries and
two-phase commit transactions, as well as for some replication functionality. At this point you need to install
MSDTC by opening a command prompt and running comclust.exe on each node in the cluster. (comclust.exe
can be found in the Winnt\system32 directory.)
Installing IBM DB2 Enterprise Extended Edition on the Clustered Windows Servers
Installing IBM DB2 in a Windows 2000 cluster is a little different than installing on a single node. Follow the
instructions in the IBM Whitepapers on Implementing IBM DB2 on a Windows Cluster found in the resources
section, along with the IBM DB2 documentation for installing to a cluster. I would recommend reading both of
the papers (2 node and 4 node), since each has information that the other does not.
A list of additional notes we gathered while running through this process is below.
1. When creating a new datasource or WebSphere repository which will use the databases on the cluster, be
sure to catalog the database using the Virtual IP address of the DB2 servers as the remote host IP.
2. Add the virtual name/IP that you identified in the DB2 Server setup to the DNS server or hosts files on all
nodes that need to see this cluster.
3. The account used for DB2 Server service should be a domain login.
4. When running through the verification tests (later in this paper), be sure to set the DB2 environment
variable DB2_FALLBACK to ON using db2set DB2_FALLBACK=ON; otherwise, the system may not fail
over if a client is connected.
5. Be sure to add the DB2MPP-1 service as a clustered resource if it is not done through the DB2MSCS
tool.
Installing SQL Server on the Clustered Windows Servers
Installing SQL Server in a Windows 2000 cluster is a little different than installing on a single node. Follow the
instructions in the SQL Server documentation for installing to a cluster.
A list of additional notes we gathered while running through this process is below.
1. Make sure each node can see the other node's c$ share before starting the install. This requires having
"File and Print Sharing" enabled on the network connection.
2. Add the virtual name/IP that you identified in the SQL Server setup to the DNS server or hosts files on all
nodes that need to see this cluster.
3. The login for SQL Server should be a domain login.
4. Use both SQL Server authentication and NT authentication to allow the WebSphere JDBC driver to
log in.
Installing Merant SequeLink Service on the SQL Cluster Servers
After installing Microsoft SQL Server on the cluster, you are now ready to install the Merant SequeLink
Server on both nodes of the SQL Server cluster. Since the Merant SequeLink Server is not cluster aware, you
need to install it separately on each node of the cluster.
1. Download the Merant SequeLink Server from the WebSphere e-fix page. (A link to this page can be found
in the Useful Resources section at the end of this document.)
2. Unzip the Merant SequeLink files to a temporary directory.
3. Run the SequeLink setup by opening a command prompt, changing to the directory where you extracted
the files, and running setup /v"IPE=NO" (this command prevents the setup from asking for a registration key).
4. Click through the Welcome page and accept the license.
5. At the following step in the install, you must change the drive that the Merant server is installed to, to the
shared drive. Here the shared drive is drive U.
6. The next step asks for the Agent name/port, Server name/port, and an account to administer the server. I
would recommend leaving the default settings unless this would cause a port conflict or there are security
requirements that force you to change these. The most important aspect of this page is the user account. Be
sure to enter an account from the domain, not the local computer, so that when a failover occurs the account
will still have the proper permissions to administer the new node and will still be available after a failure of the
first node.
7. After clicking the Next button, the install will begin.
8. This process only installs Merant SequeLink on one node. Even though we are installing the files to the
shared drive, values are entered in the registry on each system. At this point, it is recommended that you back
up the swandm.ini file found on the shared drive in the Program Files\MERANT\slserver51\cfg directory.
9. To finish the cluster install of Merant, fail over to the second node and run through the exact same process
to install the registry settings. Be sure to use the EXACT same install path location. This will overwrite the files
on the shared drive, but doing so does not cause any damage.
10. After installation has completed, go to the Windows Services on each system and change both the
SLAgent51 and SLSQLServer51 services to start manually.
Configuring Merant as a Cluster Resource
Once the Merant server is installed on each node, you need to identify it as a cluster resource so that if it fails
on a node, the cluster will force a failover.
1. Start up the Cluster Administrator. (Start->Programs->Administrative Tools->Cluster Administrator)
2. Right-click on the SQL Server Group and select New->Resource.
3. Fill in the appropriate values:
Name: SequeLinkAgent
Resource Type: Generic Service
Group: SQL Server
4. Assign both nodes as possible owners of this resource.
5. Add the following to the Resource Dependencies:
Disk U: (Quorum Disk)
SQL IP Address
SQL Network Name
6. Set up the service parameters.
Service Name: SLAgent51 (must match the service that is installed)
7. You need to add a registry key to be replicated on a failover. Click Add and enter the string
SOFTWARE\MERANT\SequeLink51\SLSQLServer.
8. Click Finish.
Now you must follow the same process for the SequeLinkListener service.
9. Follow steps 2-8 of the above procedure, but when you get to the Generic Service Parameters (step 6),
enter the following:
Service Name: SLSQLServer51
10. Click Finish. Merant is now set up as a resource in the cluster.
Additional Setup Steps For the Merant Driver
In addition to the setup already performed, the following steps need to be performed to get the environment to
function properly.
Changing Merant SequeLink to use TCP/IP Sockets
Merant SequeLink 5.1 Server defaults to using Named Pipes to connect to MS SQL server. To change it to
use TCP/IP you need to follow the steps outlined in the "How to Configure MERANT SequeLink Server
for TCP/IP Sockets Net-Library" in the Useful Resources section.
Change the Node Hostnames to Virtual Hostname in the swandm.ini File
After installing the Merant SequeLink Server, the configuration file swandm.ini (found at
SharedDrive:\Program Files\MERANT\slserver51\cfg) was written using the hostnames of each node in
the cluster. This will not work for the clustered environment since everything needs to be referenced using the
Virtual Hostname of the SQL Server.
Change all instances of the individual node hostnames in swandm.ini to the virtual IP/hostname. (Some
examples of where this occurs are the serviceHost and ServiceConnectInfo listings.)
Setting up SQL Server Tables / Usernames
To create a database and user to use for WebSphere, follow the steps below.
1.) Start the SQL Server Enterprise Manager on the active SQL node.
2.) Expand the Console Root until you get to the running SQL Server object.
3.) Expand this object and right-click on the Databases folder.
4.) Select New Database.
5.) Enter the database name, change any necessary parameters, and click the OK button.
6.) This will create the new database to use.
Now you need to create a new user for this database that WebSphere will use to access this database.
1.) Using the tree in the left pane, expand the Security folder and right-click the Logins object.
2.) Select New Login.
3.) Enter a name for this login and have it use SQL Server Authentication. Select the database you just
created as the Default database.
4.) Click on the Database Access tab and checkmark the database you just created. Give this user the
appropriate permissions to this database.
5.) Click OK and test that you can access the database with this user by using the Query Analyzer or some
other database client.
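To verify the new login from outside Enterprise Manager, the osql command-line utility can also be used; the virtual server name, database, and credentials below are placeholders for illustration:
osql -S SQLVS -U wasuser -P waspassword -d wasdb -Q "SELECT @@SERVERNAME"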
Configuring WebSphere to use SQL Server as an application Database
Configuring WebSphere to use the Merant Drivers to access a clustered SQL Server is almost identical to
setting up any other database driver/datasource. The main thing to remember is that you will be referencing the
virtual IP of the SQL Server cluster, not each node individually.
1. Start out by installing the Merant database driver to each WebSphere node within your WebSphere domain.
To do this, start up the WebSphere Administrative Console and expand the tree to WAS
Domain->Resources->JDBC Providers.
2. Right-click on the JDBC Providers folder and select New.
3. Enter a name for the driver, such as "Merant Driver", and choose the Merant driver
com.merant.sequelink.jdbcx.datasource.SequeLinkDataSource as the Implementation Class.
4. Click on the Nodes tab and install the Merant database driver to the WAS node, selecting the appropriate
jar files: D:\WebSphere\AppServer\lib\sljc.jar;D:\WebSphere\AppServer\lib\sljcx.jar
5. After the driver has been installed, expand it and right-click on the Data Sources folder. Select New.
6. Fill in the appropriate information such as the Data Source Name, JNDI Name, Database Name, User
ID/password, etc. Additionally, be sure to add the following parameters to the Custom Properties:
serverName = <SQLServer Virtual Name or IP>
portNumber = 19996 (or whatever you identified when installing Merant)
disable2Phase = true
7. Set up any beans in your test application to use this datasource with the JNDI name you specified.
Configuring WebSphere to use SQL Server as an Administrative Repository Database
If you would like to use the clustered SQL Server database as the database for your WebSphere Application
Server repository, you have a couple of ways of specifying this. Starting with WAS 4.0, when you install WAS
you are given the option to use the Merant drivers as the JDBC driver for your repository. This is the easiest
way to set up WAS to use SQL Server as your repository database.
The other way is to install WebSphere using one of the other supported databases and then use the Database
Conversion tool to change the appropriate settings and point WebSphere to a different database. (See the
Useful Resources section for a link to the Database Conversion tool.) Be sure to get the dbconfig4 conversion
tool, not dbconfig, and then follow the included instructions.
Verifying your configuration
Once you have everything set up, you will want to verify that the environment fails over correctly. The following
scenarios can be run to verify that if certain points in the environment fail, the workload rolls over to the
functioning node.
• Manual Push to Passive Node - Within the Cluster Administrator, you can right-click on a group and
click "Move Group" to move it to the passive node.
• Clean Shutdown of Active Node - Without manually moving any of the groups, go to Start->Shutdown
and power down the Active Node.
• Unexpected power failure on Active Node - Physically pull the power cable from the Active Node.
• Public network cable failure on the Active Node - Only pull the public network cable from the Active
Node.
The Microsoft cluster should recognize that one (or more) of the resources failed on the active node and
transition all of the components from the failing node to the alternate node. At this point, all the connections to
WebSphere are broken, so on the next request from WebSphere, stale connections would be detected and a
StaleConnectionException would be thrown to the WebSphere Application (as described in Chapter 6). After
the transition to the new active node was completed WebSphere would reestablish connections to the
database. Applications programmed according to the guidelines in Chapter 6 would also reconnect to the
database.
Resources
• IBM Redbooks, available from http://www.redbooks.ibm.com
• WebSphere Edge Server: Working with Web Traffic Express and Network Dispatcher (SG24-6172-00)
• WebSphere V4.0 Advanced Edition Handbook (SG24-6176-00)
• WebSphere 4.0 InfoCenter
• http://www-4.ibm.com/software/webservers/appserv/infocenter.html
• Microsoft HCL for Windows 2000
• http://www.microsoft.com/windows2000/server/howtobuy/upgrading/compat/default.asp
• Microsoft HCL for Clustering
• http://www.microsoft.com/hcl/default.asp (Search on "cluster")
• Introducing Windows 2000 Clustering Technologies
• http://www.microsoft.com/windows2000/techinfo/howitworks/cluster/introcluster.asp
• Microsoft Clustering Step-by-Step guide
• http://www.microsoft.com/windows2000/techinfo/planning/server/clustersteps.asp
• Recommended Private "Heartbeat" Configuration on a Cluster Server (Q258750)
• http://support.microsoft.com/default.aspx?scid=kb;en-us;Q258750
• WebSphere 4.0 Database Conversion Tool
• http://www-4.ibm.com/software/webservers/appserv/tools_intro.htm
• Handling WebSphere Connections Correctly
• http://www7.software.ibm.com/vad.nsf/data/document4382?OpenDocument&p=1&BCT=1&Footer=1
• How to Configure MERANT SequeLink Server 5.1 for MS SQL Server using TCP/IP Sockets Net-Library
• http://www-1.ibm.com/servlet/support/manager?rs=180&rt=0&org=SW&doc=1008271
• Merant Server 5.1 Download on WebSphere E-fix page
• http://www-3.ibm.com/software/webservers/appserv/efix-archive.html#fp353
• Configuring Merant to use TCP/IP instead of Named Pipes
• http://www-1.ibm.com/servlet/support/manager?rs=180&rt=0&org=SW&doc=1008271
• Creating a Merant Datasource in WebSphere
• http://www-1.ibm.com/servlet/support/manager?rs=180&rt=0&org=SW&doc=1008413
• Instructions on installing the Merant SequeLink 5.1 Server
• http://www7b.boulder.ibm.com/wsdd/library/techarticles/0109_hiranniah/0109_hiranniahpt1.html
• Implementing IBM DB2 Universal Database Enterprise Extended Edition with Microsoft Cluster Server (4 Node Cluster)
• http://www-4.ibm.com/software/data/pubs/papers/#mscseee
• Implementing IBM DB2 Universal Database Enterprise Edition with Microsoft Cluster Server (2 Node Cluster)
• http://www-4.ibm.com/software/data/pubs/papers/#mscs