Chapter 2. Overview of high availability concepts 15
availability on the order of, for example, 24x7x365, the duration of a
planned outage can be minimized by applying techniques such as rolling
upgrades for software and hot replacement of hardware. Additionally, these
events can be scheduled during periods of lighter system load.
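As an illustration of the rolling upgrade technique mentioned above, the following is a minimal sketch, not a description of any particular product. The function name rolling_upgrade, the min_in_service parameter, and the node names are all hypothetical. The idea shown is that nodes are taken out of service one at a time, so the planned outage never affects the whole system at once.

```python
def rolling_upgrade(nodes, upgrade, min_in_service=1):
    """Upgrade nodes one at a time so that some capacity always stays in service.

    `nodes`, `upgrade`, and `min_in_service` are hypothetical names used
    only for this illustration.
    """
    in_service = set(nodes)
    upgraded = []
    for node in nodes:
        # Refuse to proceed if removing this node would leave too few peers.
        if len(in_service) - 1 < min_in_service:
            raise RuntimeError("not enough nodes to keep the service available")
        in_service.remove(node)  # drain: stop routing work to this node
        upgrade(node)            # apply the upgrade while peers carry the load
        in_service.add(node)     # return the upgraded node to service
        upgraded.append(node)
    return upgraded


# Hypothetical two-node system, both nodes starting at software level 1.0.
versions = {"node_a": "1.0", "node_b": "1.0"}


def apply_patch(node):
    versions[node] = "1.1"


order = rolling_upgrade(list(versions), apply_patch)
print(order)     # ['node_a', 'node_b']
print(versions)  # {'node_a': '1.1', 'node_b': '1.1'}
```

With a single node the sketch raises an error instead of upgrading, which reflects the point that this technique depends on having a peer available to carry the load.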
Reasons for planned outages include:
The installation of software upgrades or patches
Periodically scheduled backups
Hardware repairs or expansions
The physical movement of applications or hardware
Unplanned outages are the more urgent concern as they may occur at a time
when the system is required to be operational. The major contributors to
unplanned downtime are:
Hardware failure
Software failure
Human error
Environmental conditions: climate control, power failure, power spike
Natural disasters: fire, flood, earthquake
2.4 The focus of this redbook
In this book, the primary focus is on solutions that help manage hardware and
software failures. Examples of the types of failures these solutions address
are:
Network failures
Disk failures
Processor failures
Application software defects
System software defects
The various levels of system availability can be achieved in several different
ways. Each relies on a combination of software, hardware, and operational
procedures that work together.
2.5 Clustering for high availability
Clustering solutions that provide failover support are the prevalent mechanism
used to achieve high levels of availability.
IBM High Availability Cluster Multi-Processing for AIX (HACMP) is the
platform-specific clustering solution provided by IBM for pSeries servers
running AIX. HA clustering solutions are provided by many vendors; examples
include Sun Cluster, Microsoft® Cluster Server (MSCS), VERITAS Cluster Server
(VCS), and HP Serviceguard.
These clustering solutions are transparent to the applications that run under their
control. Applications do not have to be modified before they can be deployed
within an HA cluster.
Commercial HA clustering solutions allow for a wide range of cluster
configurations. The simplest cluster configuration comprises a pair of
servers and an external SCSI or Fibre Channel storage subsystem.
The following terms are generally used when discussing cluster technologies.
Cluster
A cluster is a grouping of two or more interconnected computers that are viewed
and used as a single computing resource.
Node
A node is an individual computer system (including its hardware resources, such
as local disks, processors, and memory) that runs both the operating system
and the corresponding clustering software.
Nodes can be uniprocessors or, as in the case of a large scalable system,
symmetric multiprocessors (SMPs). An SMP node looks like a uniprocessor node
to most of the clustering technologies. The nodes in a cluster can differ in
the number of processors they have. The number of nodes that can participate
in a single cluster varies depending on the clustering technology being
implemented.
Resource groups
Resource groups enable the combination of related resources into a logical
entity. Most of the available clustering technologies employ the concept of
moving the system resources that are contained within a resource group
together. Resource groups make it possible to move a critical workload from
one node to another and to continue processing that workload on a different
cluster node. This concept will become more prominent as we discuss system and
application availability configurations, especially in the case of MQ and
Message Broker.
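To make the relationship between clusters, nodes, and resource groups concrete, the following is a minimal sketch of the terms defined above. It is a toy model under stated assumptions, not the API of HACMP or any other clustering product. The class names, the fail_node method, the resource names, and the "first surviving node" takeover rule are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One computer in the cluster (hypothetical model, not a vendor API)."""
    name: str
    processors: int = 1   # 1 models a uniprocessor, more than 1 an SMP node
    online: bool = True


@dataclass
class ResourceGroup:
    """Related resources that are moved between nodes together."""
    name: str
    resources: List[str]
    owner: Optional[Node] = None


@dataclass
class Cluster:
    """Two or more interconnected nodes used as a single computing resource."""
    nodes: List[Node]
    groups: List[ResourceGroup] = field(default_factory=list)

    def fail_node(self, failed: Node) -> None:
        """Mark a node offline and move its resource groups to a survivor."""
        failed.online = False
        survivors = [n for n in self.nodes if n.online]
        for group in self.groups:
            if group.owner is failed:
                # Assumption: the first surviving node takes over. Real
                # products apply their own configurable takeover policies.
                group.owner = survivors[0] if survivors else None


node_a = Node("node_a", processors=4)
node_b = Node("node_b", processors=2)
mq_group = ResourceGroup(
    "mq_rg",
    ["service IP address", "shared disk volume", "queue manager"],
    owner=node_a,
)
cluster = Cluster([node_a, node_b], [mq_group])

cluster.fail_node(node_a)
print(mq_group.owner.name)  # node_b
```

The point of the sketch is that the resource group, not the individual resource, is the unit that changes owner: all three illustrative resources follow the group to node_b in a single step.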