
Configuring Replication Server 12.1 for High Availability on Sun Cluster 2.2

This white paper provides background and procedures for configuring Sybase Replication Server for high availability (HA) on Sun Cluster 2.2
 

Contents

  • Introduction
  • Terminology
  • Technology overview
  • Configuring Replication Server for high availability
  • Administering Replication Server as a data service

Introduction

This white paper assumes that:

  • You are familiar with Sybase Replication Server. This white paper does not explain the steps necessary to install Sybase Replication Server.

  • You are familiar with Sun Cluster HA. This white paper does not explain the steps necessary to install Sun Cluster HA.

  • You have two-node cluster hardware running Sun Cluster HA 2.2.

Documentation references:

  • Sun Cluster 2.2 Software Planning and Installation Guide

  • Sun Cluster 2.2 System Administration Guide

  • Configuring Sybase Adaptive Server Enterprise 12.0 Server for High Availability: Sun Cluster HA (see White Papers under www.sybase.com/products/databaseservers/ase).

  • Replication Server documentation (see Product Manuals under www.sybase.com/products/eaimiddleware/replicationserver).

Terminology

This document uses these terms:

  • Cluster – multiple systems, or nodes, that work together as a single entity to provide applications, system resources, and data to users.

  • Cluster node – a physical machine that is part of a Sun Cluster. Also called a physical host.

  • Data service – an application that provides client service on a network and implements read and write access to disk-based data. Replication Server and Adaptive Server Enterprise are examples of data services.

  • Disk group – a well-defined group of multihost disks that move as a unit between two servers in an HA configuration.

  • Fault monitor – a daemon that probes data services.

  • High availability (HA) – very low downtime. Computer systems that provide HA usually provide 99.999% availability, which works out to roughly five minutes of unscheduled downtime per year (0.001% of the 525,600 minutes in a year).

  • Logical host – a group of resources including a disk group, logical host name, and logical IP address. A logical host resides on (or is mastered by) a physical host (or node) in a cluster machine. It can move as a unit between physical hosts on a cluster.

  • Master – the node that has exclusive read and write access to the logical host’s disk group and whose Ethernet address the logical IP address is mapped to. The current master of the logical host runs the logical host’s data services.

  • Multihost disk – a disk configured for potential accessibility from multiple nodes.

  • Failover – the event triggered by a node or a data service failure, in which logical hosts and the data services on the logical hosts move to another node.

  • Failback – a planned event in which a logical host and its data services are moved back to the original host.

Technology overview

Sun Cluster HA is a hardware- and software-based high availability solution. It provides high availability support on a cluster machine and automatic data service failover in just a few seconds. It accomplishes this by adding hardware redundancy, software monitoring, and restart capabilities.

Sun Cluster provides cluster management tools for a System Administrator to configure, maintain, and troubleshoot HA installations.

The Sun Cluster configuration tolerates these single-point failures:

  • Server hardware failure

  • Disk media failure

  • Network interface failure

  • Server OS failure

When any of these failures occurs, the HA software fails the affected logical hosts over to another node and restarts their data services on the new node.

Sybase Replication Server is implemented as a data service on a logical host on the cluster machine. The HA fault monitor for Replication Server periodically probes Replication Server. If Replication Server is down or hung, the fault monitor attempts to restart Replication Server locally. If Replication Server fails again within a configurable period of time, the fault monitor fails the logical host over so that Replication Server is restarted on the second node.
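
Conceptually, each probe cycle is simple: sleep, connect to Replication Server with the configured login, and escalate if the connection fails or times out. The following is only an illustrative sketch of that cycle, not the shipped repserver_fm script; PROBE_CYCLE, PROBE_TIMEOUT, LOGIN, PASSWORD, and RS_NAME are hypothetical placeholders for the values the monitor reads from its configuration file (described later in this paper):

#!/bin/sh
# Illustrative sketch of a fault-monitor probe loop -- not the shipped
# repserver_fm script. All variable names are hypothetical placeholders.
while true; do
    sleep $PROBE_CYCLE
    # Ping Replication Server by running a harmless RCL command through isql.
    printf 'admin who\ngo\n' |
        isql -U$LOGIN -P$PASSWORD -S$RS_NAME -t$PROBE_TIMEOUT \
            > /dev/null 2>&1
    if [ $? -ne 0 ]; then
        :   # probe failed: restart locally, or give up the logical host
    fi
done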

To Replication Server clients, it appears as though the original Replication Server has simply restarted. The move to another physical machine is transparent to users, because Replication Server is affiliated with a logical host, not with a physical machine.

As a data service, the Replication Server includes a set of scripts registered with Sun Cluster as callback methods. Sun Cluster calls these methods at different stages of failover:

  • FM_STOP – to shut down the fault monitor for the data service to be failed over.

  • STOP_NET – to shut down the data service itself.

  • START_NET – to start the data service on the new node.

  • FM_START – to start the fault monitor on the new node for the data service.

Each Replication Server is registered as a data service using the hareg command. If you have multiple Replication Servers running on the cluster, you must register each of them. Each data service has its own fault monitor as a separate process.
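
As a quick check, hareg invoked with no arguments is documented to list each registered data service and whether it is on or off, which is a convenient way to confirm that your Replication Servers are registered:

hareg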



Note:

For detailed information about the hareg command, see the appropriate Sun Cluster documentation.


Configuring Replication Server for high availability

This section describes the tasks required to configure a Replication Server for HA on Sun Cluster (assuming a two-node cluster machine).

Configuring Sun Cluster for HA

The system should have the following components:

  • Two homogeneous Sun Enterprise servers with similar configurations in terms of resources such as CPU and memory. The servers should be configured with a cluster interconnect, which is used for maintaining cluster availability, synchronization, and integrity.

  • The system should be equipped with a set of multihost disks. The multihost disks hold the data (partitions) for a highly available Replication Server. A node can access data on a multihost disk only when it is the current master of the logical host to which the disk belongs.

  • The system should have Sun Cluster HA software installed, with automatic failover capability. The multihost disks should have unique path names across the system.

  • For disk failure protection, disk mirroring (not provided by Sybase) should be used.

  • Logical hosts should be configured. Replication Server runs on a logical host.

  • Make sure the logical host for the Replication Server has enough disk space in its multihosted disk groups for the partitions, and that any potential master for the logical host has enough memory for the Replication Server.

Installing Replication Server for HA

During Replication Server installation, you need to perform these tasks in addition to the tasks described in the Replication Server installation guide:

  1. As a Sybase user, load Replication Server either on a shared disk or on the local disk. If it is on a shared disk, the release cannot be accessed from both machines concurrently. If it is on a local disk, make sure the release paths are the same on both machines; if they are not, use a symbolic link so that they are. For example, if the release is in /node1/repserver on node1 and /node2/repserver on node2, link both to /repserver so that the $SYBASE environment variable is the same across the system.
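
    For example (as root, with the hypothetical per-node paths above):

      # On node1:
      ln -s /node1/repserver /repserver
      # On node2:
      ln -s /node2/repserver /repserver

    $SYBASE can then be set to /repserver on both nodes.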

  2. Add entries for Replication Server, RSSD server, and primary/replicate data servers to the interfaces file in the $SYBASE directory on both machines. Use the logical host name for Replication Server in the interfaces file.
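
    For illustration only, an entry for a hypothetical Replication Server REP1 on logical host loghost1, port 5001, might look like the following in the common tcp ether form (on Solaris, utilities such as dsedit or dscp may write the equivalent master and query lines in TLI format instead):

      REP1
          master tcp ether loghost1 5001
          query tcp ether loghost1 5001

    The important point is that the host field names the logical host, not a physical node.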

  3. Start the RSSD server.

  4. Follow the installation guide for your platform to install Replication Server on the node that is currently the master of the logical host. Make sure that you:

    1. Set the environment variables SYBASE, SYBASE_REP, and SYBASE_OCS:

      setenv SYBASE /REPSERVER1210
      setenv SYBASE_REP REP-12_1
      setenv SYBASE_OCS OCS-12_0

      where /REPSERVER1210 is the release directory.

    2. Choose a run directory for the Replication Server that will contain the Replication Server run file, configuration file, and log file. The run directory should exist on both nodes and have exactly the same paths on both nodes (the path can be linked if necessary).

    3. Choose the multihosted disks for the Replication Server partitions.

    4. Run the rs_init utility from the run directory:

      cd RUN_DIRECTORY
      $SYBASE/$SYBASE_REP/install/rs_init

  5. Make sure that Replication Server is started.

  6. As a Sybase user, copy the run file and the configuration file to the other node in the same path. Edit the run file on the second node to make sure it contains the correct path of the configuration and log files, especially if links are used.
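
    For example, with the hypothetical names used earlier (any remote copy method will do):

      rcp /repserver/run/RUN_REP1 /repserver/run/REP1.cfg node2:/repserver/run/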



    Note:

    The run file name must be RUN_repserver_name, where repserver_name is the name of the Replication Server. You can define the configuration and log file names.
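
    For reference, the run file written by rs_init generally resembles the following sketch (hypothetical names and paths; yours will reflect the choices made during rs_init):

      #!/bin/sh
      # RUN_REP1 -- starts Replication Server REP1 (illustrative sketch)
      /REPSERVER1210/REP-12_1/bin/repserver -SREP1 \
          -C/repserver/run/REP1.cfg \
          -E/repserver/run/REP1.log \
          -I/REPSERVER1210/interfaces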


Installing Replication Server as a data service

You also need to perform these specialized tasks to install Replication Server as a data service:

  1. As root, create the directory /opt/SUNWcluster/ha/repserver_name on both cluster nodes, where repserver_name is the name of your Replication Server. Each Replication Server must have its own directory with the server name in the path. Then copy the following scripts from the Replication Server installation directory $SYBASE/$SYBASE_REP/sample/ha to:

    /opt/SUNWcluster/ha/repserver_name

    on both cluster nodes:

    repserver_start_net
    repserver_stop_net
    repserver_fm_start
    repserver_fm_stop
    repserver_fm
    repserver_shutdown
    repserver_notify_admin

    If the scripts already exist on the local machine as part of another Replication Server data service, you can create:

    /opt/SUNWcluster/ha/repserver_name

    as a link to the script directory instead.

  2. As root, create the directory /var/opt/repserver on both nodes if it does not exist.

  3. As root, create a file /var/opt/repserver/repserver_name on both nodes for each Replication Server you want to install as a data service on Sun Cluster, where repserver_name is the name of your Replication Server. This file must contain exactly two lines in the following form, with no blank spaces, and should be readable only by root:

    repserver:logicalHost:RunFile:releaseDir:SYBASE_OCS:SYBASE_REP

    probeCycle:probeTimeout:restartDelay:login/password

    where:

    • repserver – the Replication Server name.

    • logicalHost – the logical host on which Replication Server runs.

    • RunFile – the complete path of the runfile.

    • releaseDir – the $SYBASE installation directory.

    • SYBASE_OCS – the $SYBASE subdirectory where the connectivity library is located.

    • SYBASE_REP – the $SYBASE subdirectory where the Replication Server is located.

    • probeCycle – the number of seconds between the start of two probes by the fault monitor.

    • probeTimeout – time, in seconds, after which a running Replication Server probe is aborted by the fault monitor, and a timeout condition is set.

    • restartDelay – minimum time, in seconds, between two Replication Server restarts. If, less than restartDelay seconds after a Replication Server restart, the fault monitor again detects a condition that requires a restart, it triggers a switchover to the other host instead. This handles situations in which a local restart does not solve the problem.

    • login/password – the login/password the fault monitor uses to ping Replication Server.
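
    As a hedged illustration, a file for a hypothetical Replication Server REP1 on logical host loghost1 might contain:

      REP1:loghost1:/repserver/run/RUN_REP1:/REPSERVER1210:OCS-12_0:REP-12_1
      60:30:300:sa/sa_password

    With these illustrative values, the fault monitor starts a probe every 60 seconds, aborts a probe after 30 seconds, and allows at most one local restart per 300 seconds before switching the logical host to the other node.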

    To change probeCycle, probeTimeout, restartDelay, or login/password after Replication Server is installed as a data service, send SIGINT (signal 2) to the monitor process (repserver_fm) so that it rereads the file:

    kill -2 monitor_process_id

  4. As root, create a file /var/opt/repserver/repserver_name.mail on both nodes, where repserver_name is the name of your Replication Server. This file lists the UNIX login names of the Replication Server administrators. The login names should all be on one line, separated by single spaces.

    If the fault monitor encounters any problems that need intervention, this is the list to which it sends mail.
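
    For example (hypothetical login names):

      rsadmin1 rsadmin2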

  5. Register the Replication Server as a data service on Sun Cluster:

    hareg -r repserver_name \
        -b "/opt/SUNWcluster/ha/repserver_name" \
        -m START_NET="/opt/SUNWcluster/ha/repserver_name/repserver_start_net" \
        -t START_NET=60 \
        -m STOP_NET="/opt/SUNWcluster/ha/repserver_name/repserver_stop_net" \
        -t STOP_NET=60 \
        -m FM_START="/opt/SUNWcluster/ha/repserver_name/repserver_fm_start" \
        -t FM_START=60 \
        -m FM_STOP="/opt/SUNWcluster/ha/repserver_name/repserver_fm_stop" \
        -t FM_STOP=60 \
        [-d sybase] -h logical_host

    where -d sybase is required if the RSSD is under HA on the same cluster, and repserver_name is the name of your Replication Server (it must appear in the path of the scripts).

  6. Turn on the data service using hareg -y repserver_name.
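
To verify the overall result, you can use the Sun Cluster status utility hastat, which displays the state of the cluster nodes, logical hosts, and registered data services (see the Sun Cluster 2.2 System Administration Guide for the exact output):

hastat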

Administering Replication Server as a data service

This section describes how to start and shut down Replication Server as a data service, and useful logs for monitoring and troubleshooting.

Data service start/shutdown

Once a Replication Server is registered as a data service, use:

hareg -y repserver_name

to start Replication Server as a data service. This starts Replication Server if it is not already running, and also starts the fault monitor for Replication Server.

To shut down Replication Server, use:

hareg -n repserver_name

If Replication Server is shut down or killed in any other way, the fault monitor restarts it or fails it over.
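
Failback is a planned operation performed with the Sun Cluster haswitch command, which moves a logical host (and the data services on it) to the named physical node. For example, to move logical host loghost1 back to physical host phys-node1 (hypothetical names):

haswitch phys-node1 loghost1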

Logs

There are several logs you can use for debugging:

  • Replication Server log – Replication Server logs its messages here. Use the log to find informational and error messages from Replication Server. The log is located in the Replication Server run directory.

  • Script log – the data service START and STOP scripts log messages here. Use the log to find informational and error messages that result from running the scripts. The log is located in /var/opt/repserver/harep.log.

  • Console log – the operating system logs messages here. Use this log to find informational and error messages from the hardware. The log is located in /var/adm/messages.

  • CCD log – the Cluster Configurations Database, which is part of the Sun Cluster configuration, logs messages here. Use this log to find informational and error messages about the Sun Cluster configuration and health. The log is located in /var/opt/SUNWcluster/ccd/ccd.log.
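
When troubleshooting, it is often convenient to watch the script log and the console log as events occur, for example:

tail -f /var/opt/repserver/harep.log
tail -f /var/adm/messages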


 
