StarQuest Technical Documents

SQDR Plus High Availability

Last Update: 20 December 2013
Product: SQDR Plus
Version: 4.0 & later
Article ID: SQV00PL014

 

Abstract

This article describes the High Availability with Load Balancing option introduced in SQDR Plus Version 4.13 and SQDR 4.10. This option is supported for DB2 for i and DB2 for LUW sources.

Solution

The High Availability option is based on an N+1 redundant design, with load balancing across all N+1 servers.

This function permits two SQDR Plus installations to be configured for High Availability with Hot-Backup and automatic failover. In a High Availability deployment, a particular agent may be designated as the Active server and other agents as Passive servers, at the journal level. Because a High Availability deployment requires multiple agents, an agent may be configured to be active for some journals and passive for others, which allows load to be distributed across the agents during normal operation.

For example, in an N=1 scenario (that is, 1+1), suppose SQDR Plus is deployed on Machines “A” and “B”. Machine “A” is designated as the Active agent for journals JRN01 and JRN02, and Machine “B” is designated as the Active agent for journals JRN03 and JRN04. Machine “A” can then also be defined as a Passive agent for journals JRN03 and JRN04, while Machine “B” is defined as a Passive agent for journals JRN01 and JRN02.

In normal operation, Machine “A” is applying changes to the destination database for JRN01 & JRN02 and Machine “B” is applying changes to the destination database for JRN03 & JRN04.

Should Machine “A” fail, Machine “B” automatically becomes the Active agent for JRN01 & JRN02, and assumes the full responsibility for applying all change data to the destination database.

Once Machine “A” comes back online, Machine “B” automatically becomes the Passive agent for JRN01 & JRN02 again.

Similarly, should Machine “B” fail, Machine “A” automatically becomes the Active agent for JRN03 & JRN04, and assumes the full responsibility for applying all change data to the destination database.

Once Machine “B” comes back online, Machine “A” automatically becomes the Passive agent for JRN03 & JRN04 again.

Thus High Availability is achieved with an N+1 deployment with load balancing.
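
The A/B example above can be modeled as a small ownership table. The following Python sketch is illustrative only: the PREFERRED and STANDBY tables and the owner and assignments functions are names invented for this example, not part of SQDR Plus, and the sketch does not describe how the product actually detects a failed agent.

```python
# Illustrative model of the 1+1 example; this is not SQDR Plus code.
# Preferred (Active) and standby (Passive) agent for each journal.
PREFERRED = {"JRN01": "A", "JRN02": "A", "JRN03": "B", "JRN04": "B"}
STANDBY = {"JRN01": "B", "JRN02": "B", "JRN03": "A", "JRN04": "A"}


def owner(journal, online):
    """Return the machine currently applying changes for a journal.

    `online` is the set of machines that are up. The preferred agent
    owns the journal whenever it is online; otherwise the standby
    agent takes over. Returns None if both machines are down.
    """
    if PREFERRED[journal] in online:
        return PREFERRED[journal]
    if STANDBY[journal] in online:
        return STANDBY[journal]
    return None


def assignments(online):
    """Map every journal to its current owner."""
    return {journal: owner(journal, online) for journal in sorted(PREFERRED)}


normal = assignments({"A", "B"})      # load is split across A and B
a_failed = assignments({"B"})         # B takes over JRN01 and JRN02
a_restored = assignments({"A", "B"})  # B is Passive for JRN01/JRN02 again
```

In this model the load is balanced in normal operation because each machine is the preferred owner of half of the journals, and failback happens simply because the preferred owner is always chosen when it is online.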

Sample Scenarios

The following drawings illustrate several sample scenarios.

In Scenario 1, we deploy multiple Windows machines functioning as Tier 2 (SQDR Plus) and Tier 3 (SQDR), with one machine designated as running the active agent. On each machine, SQDR is configured to use the local SQDR Plus agent. If the active machine should fail, then the second machine will automatically become the active agent.

In Scenario 2, we deploy multiple Windows or Linux machines functioning as Tier 2 (SQDR Plus). The Windows machine functioning as Tier 3 (SQDR) is a member of a Windows Failover Cluster. SQDR is configured with duplicate subscriptions, communicating with both agents.

SQDR Plus Configuration and Operation

Configuration:

The following parameters should be configured for SQDR Plus:

usingFailover

Set this parameter to true on both the active and passive agents to enable Failover.
Default value: false.

failOverSchema

Source database schema of the active agent. This parameter should be configured on both the active and passive agents. If the parameter is not set, the default is the sourceDbSchema of the agent being configured.
Default value: <sourceDbSchema>
Example: SQDR
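
The defaulting rules for these two parameters can be sketched as follows. This Python fragment is a hypothetical illustration: it assumes the agent's parameters have already been loaded into a dictionary (this article does not describe where or how the parameters are stored), the effective_failover_settings helper is invented for the example, and the schema name SQDR2 is made up.

```python
def effective_failover_settings(params):
    """Resolve the two failover parameters with their documented defaults.

    `params` is a hypothetical dict of agent parameters (string values).
    usingFailover defaults to false; failOverSchema defaults to the
    sourceDbSchema of the agent being configured.
    """
    using = str(params.get("usingFailover", "false")).strip().lower() == "true"
    schema = params.get("failOverSchema") or params["sourceDbSchema"]
    return {"usingFailover": using, "failOverSchema": schema}


# Active agent whose own schema is SQDR: failOverSchema may be omitted.
active = effective_failover_settings(
    {"sourceDbSchema": "SQDR", "usingFailover": "true"}
)

# Passive agent assumed to use a different local schema (SQDR2 is made up),
# so failOverSchema is set explicitly to the active agent's schema.
passive = effective_failover_settings(
    {"sourceDbSchema": "SQDR2", "usingFailover": "true", "failOverSchema": "SQDR"}
)

# An agent with neither parameter set: failover stays disabled.
unconfigured = effective_failover_settings({"sourceDbSchema": "SQDR"})
```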

Operation:

The summary view of an agent shows the current status of the Staging Agent. If the agent is active, you should see Capture Agent (Server status) as Running and Agent (TCP Listener) as Active. If you have created subscriptions, the display will show an active worker for each journal being monitored.

If you have configured the Agent for High Availability (usingFailover=true) you can select a worker and change its status to Standby from the Resource menu. The status of the Agent will be displayed as Standby:<name of active agent>.

Messages similar to the following will appear in the agent logs (Diagnostics):

WARNING: FailOverTable.makeStandby: Worker is now in standby mode. Worker=MYLIB.QSQJRN

WARNING: FailOverTable.poll: Taking over processing for worker MYLIB.QSQJRN from AGENTN
WARNING: CaptureAgent.active:MYLIB.QSQJRN
WARNING: FailOverTable.makeActive: Worker is now in active mode. Worker=MYLIB.QSQJRN
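
If the diagnostics are exported as text, messages like these could be scanned to track the last known mode of each worker. The following Python sketch assumes the log lines look exactly like the samples above; actual message formats may differ between versions, and the worker_states function is a name invented for this example.

```python
import re

# Patterns taken from the sample messages above; real logs may differ.
STANDBY_RE = re.compile(r"FailOverTable\.makeStandby: .*Worker=(\S+)")
ACTIVE_RE = re.compile(r"FailOverTable\.makeActive: .*Worker=(\S+)")


def worker_states(lines):
    """Return the last observed mode ('standby' or 'active') per worker."""
    states = {}
    for line in lines:
        match = STANDBY_RE.search(line)
        if match:
            states[match.group(1)] = "standby"
            continue
        match = ACTIVE_RE.search(line)
        if match:
            states[match.group(1)] = "active"
    return states


sample = [
    "WARNING: FailOverTable.makeStandby: Worker is now in standby mode. Worker=MYLIB.QSQJRN",
    "WARNING: FailOverTable.poll: Taking over processing for worker MYLIB.QSQJRN from AGENTN",
    "WARNING: CaptureAgent.active:MYLIB.QSQJRN",
    "WARNING: FailOverTable.makeActive: Worker is now in active mode. Worker=MYLIB.QSQJRN",
]
states = worker_states(sample)
```

Only the makeStandby and makeActive messages change the recorded mode; the poll and CaptureAgent.active lines are ignored by this sketch.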

SQDR Configuration and Operation

Configuration:

The subscriptions in a given group must all be associated with a single journal.

After creating subscriptions on the active agent, create identical subscriptions that involve the passive agent:

  1. Pause the IR group using the active agent to avoid issues that might occur if incremental data is transmitted by the active agent while configuring the passive agent.
  2. Create a new SQDR source specifying the passive agent on the Advanced panel.
  3. Create subscriptions for the desired tables. Specify "manual baselines" for the destination if you are subscribing to large tables and wish to avoid running baselines when creating the second set of subscriptions.
  4. Select the new IR group and select "Run".
  5. Resume the IR group associated with the active agent.

Operation:

In standard operation, the icon for an incremental group or subscription is typically green (active/resumed) or yellow (paused).

The icon for a group or a subscription that is involved with HA agents will be blue (Standby-resumed) or orange (Standby-paused).

A group must be "resumed" in order to actively detect active/standby transitions. A standby group that is actively monitoring for a state transition is displayed as a blue circle with a replication statistic of "standby-enabled". A failover group that is paused is displayed as an orange circle with an exclamation point and a replication statistic of "standby-paused".
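
The icon states described above can be summarized as a lookup on two conditions: whether the group is in standby, and whether it is resumed. The following Python sketch is only a reading aid; the group_display function and its parameter names are invented for this example, and it assumes that a group that is not in standby shows the standard green or yellow icon.

```python
def group_display(standby, resumed):
    """Return (icon color, status text) for a group or subscription.

    standby: True if the group is currently standing by for an HA agent.
    resumed: True if the group is resumed, False if it is paused.
    """
    table = {
        (False, True): ("green", "active/resumed"),
        (False, False): ("yellow", "paused"),
        (True, True): ("blue", "standby-enabled"),
        (True, False): ("orange", "standby-paused"),
    }
    return table[(standby, resumed)]


monitoring = group_display(standby=True, resumed=True)       # watching for failover
not_monitoring = group_display(standby=True, resumed=False)  # will not detect it
```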

Because a resumption of active status may involve replay of transactions, a "fail-over" group should use "unique constraints" to avoid row count errors.
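
The effect of a unique constraint on replayed changes can be demonstrated with a generic example. The following Python sketch uses an in-memory SQLite database and invented table names (keyed, unkeyed); it shows the general principle only and is not a description of how SQDR applies changes to a destination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "keyed" has a unique constraint (its primary key); "unkeyed" has none.
conn.execute("CREATE TABLE keyed (id INTEGER PRIMARY KEY, val TEXT)")
conn.execute("CREATE TABLE unkeyed (id INTEGER, val TEXT)")


def apply_changes(table, rows):
    """Apply inserted rows; on the keyed table a replayed row overwrites."""
    verb = "INSERT OR REPLACE" if table == "keyed" else "INSERT"
    conn.executemany(f"{verb} INTO {table} (id, val) VALUES (?, ?)", rows)
    conn.commit()


def row_count(table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


first_pass = [(1, "a"), (2, "b"), (3, "c")]
replayed = [(2, "b"), (3, "c")]  # changes applied a second time after a takeover

for table in ("keyed", "unkeyed"):
    apply_changes(table, first_pass)
    apply_changes(table, replayed)

keyed_count = row_count("keyed")      # replay did not add rows
unkeyed_count = row_count("unkeyed")  # replay produced duplicate rows
```

With the unique constraint in place, the replayed rows resolve to the existing keys and the destination row count still matches the source; without it, the same replay leaves duplicates behind.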



 


 


DISCLAIMER

The information in technical documents comes without any warranty or applicability for a specific purpose. The author(s) or distributor(s) will not accept responsibility for any damage incurred directly or indirectly through use of the information contained in these documents. The instructions may need to be modified to be appropriate for the hardware and software that has been installed and configured within a particular organization.  The information in technical documents should be considered only as an example and may include information from various sources, including IBM, Microsoft, and other organizations.