Central Service - Failover - Overview (Enterprise only)

MOVEit Central has the capability of running in "failover mode", in which one server automatically stands in for another if the first one fails.

Requirements

MOVEit Central Failover requires:

Two computers running Windows Server 2008 and using the same type of database server (for supported operating systems and databases, see Requirements). Note: If you need to migrate MOVEit Central to a new server, these KnowledgeBase articles can help:
- Migrating MOVEit Central to a New Server
- How do I get MOVEit Central to work with Microsoft SQL Server?
MOVEit Central. The failover capability is built into all versions of MOVEit Central; however, it must be enabled by a special license key.
A license key that enables the failover capability. The same license key can be used on both computers.
A TCP connection between the two computers that allows access to ports 3472 and 3473. Also, if you use the MySQL database, NetBIOS over TCP must be enabled between the two machines so that the "Copy Database From" action can run.
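
The port requirement above can be checked before installation with a short script. The following is a minimal sketch, not part of MOVEit Central: the function name check_ports and the host name in the usage comment are illustrative assumptions, and a successful TCP connection only shows that the port is reachable, not that failover is configured correctly.

```python
import socket

# Ports named in the requirements list above.
FAILOVER_PORTS = (3472, 3473)


def check_ports(host, ports=FAILOVER_PORTS, timeout=3.0):
    """Return a dict mapping each port to True if a TCP connection succeeded."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


# Example (hypothetical host name), run from one node against the other:
#   print(check_ports("central-node2.example.com"))
```

Run the check from each node against the other, since a firewall rule may allow traffic in only one direction.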

See Failover Installation for how to install failover.

How it works

In a failover configuration, there are two computers running MOVEit Central. At any given time, one of them is the "primary node" and the other is the "secondary node". The primary node is responsible for running all tasks, and for accepting all connections from MOVEit Central Admin users. The secondary node is passive: its only responsibility is to maintain an up-to-date copy of the primary node's settings, and to promote itself to the primary node if the other node fails. The secondary node does not run tasks or allow MOVEit Central Admin to make changes to its configuration.

The secondary node connects to the primary node via the same TCP interface that is used by MOVEit Central Admin. It uses this interface to determine the health of the primary node: if it cannot connect for a few minutes, it will assume that the primary is dead and will become the primary itself. The secondary node also uses this TCP interface to replicate changes on-the-fly from the primary node. Also, both nodes use this interface to ensure that there is exactly one primary node at all times. This prevents a situation in which both nodes are primary and might transfer the same files twice.
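
The health check described above can be pictured as a timer that is reset on every successful contact. The following is a minimal sketch of that idea only, not the product's actual implementation: the class name SecondaryMonitor, the probe callable, and the 180-second grace period are assumptions (the text above says only "a few minutes").

```python
import time


class SecondaryMonitor:
    """Tracks the primary's health as seen from the secondary node (illustrative)."""

    def __init__(self, probe, grace_seconds=180.0, clock=time.monotonic):
        self.probe = probe                  # callable: True if the primary answered
        self.grace_seconds = grace_seconds  # assumed value, not a documented setting
        self.clock = clock                  # injectable so the logic can be tested
        self.role = "secondary"
        self.last_contact = clock()

    def tick(self):
        """Run one health check and return the current role."""
        if self.role == "primary":
            return self.role
        now = self.clock()
        if self.probe():
            # Primary answered: reset the silence timer.
            self.last_contact = now
        elif now - self.last_contact >= self.grace_seconds:
            # Primary has been silent for the whole grace period: take over.
            self.role = "primary"
        return self.role
```

A single missed probe does not trigger a promotion in this sketch; only continuous silence for the whole grace period does, which matches the "cannot connect for a few minutes" wording.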

Status information on the failover aspects of the system is available on the Failover tab of MOVEit Central Admin.

What gets replicated

The following settings are automatically replicated from the primary node to the secondary node:

The configuration file miccfg.xml, which contains most MOVEit Central settings, including definitions of tasks, hosts, scripts, SSH keys, date lists, and so on.
The state file micstate.xml, which contains date stamps that MOVEit Central uses to determine whether files are new.
The tamper detection file michash.xml, which contains information used to detect tampering with the database.
The PGP keyrings, PGPPath\secring.pgp and PGPPath\pubring.pgp. These files contain the PGP keys used by MOVEit Central, if the optional PGP capability has been licensed.
The MICSTATS database, a database which contains a record of tasks that were run, files that were transferred, and administrator actions that were performed.
Creation, deletion, and other manipulation of local Windows users and groups used to access MOVEit Central. (Domain users and groups don't need to be replicated as long as both failover nodes are members of the same domain.)
SSL certificates, in the Microsoft Windows certificate store. This includes:
- Client certificates, with private keys, optionally used to identify MOVEit Central when connecting to secure FTP and MOVEit DMZ servers.
- Server certificates, with private keys, used to secure communications with MOVEit Central Admin.
- Other people's certificates, without private keys, used for sending S/MIME email (a rarely used capability).

[Figure: CentralResil_TwoBox_400px.gif]

The following are not replicated:

The registry key, which can be found in one of the following locations:
- HKEY_LOCAL_MACHINE\Software\Standard Networks\MOVEitCentral
- HKEY_LOCAL_MACHINE\Software\Wow6432Node\Standard Networks\MOVEitCentral
This key contains infrequently-changed settings such as the license key, the directory used for temporary files, and so on. These settings are maintained by the Configure MOVEit Central program. If you run this configuration program and make changes on one node, you should make those same changes to the other node.
The temporary "cache" directory. Any files stored temporarily by a running task are not replicated to the secondary node. The state of any tasks which were running will be lost. Those tasks will be run at the next normally scheduled interval on the secondary node after it takes over.
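
Because the registry settings are not replicated, it can help to compare them between nodes after running the configuration program. The following is a minimal sketch that compares two sets of settings already exported as name/value pairs. The function name diff_settings and the value names in the usage comment are hypothetical, and the sketch does not read the registry itself.

```python
def diff_settings(node1, node2):
    """Return {name: (value_on_node1, value_on_node2)} for every setting that differs.

    A setting missing on one node is reported with None for that node.
    """
    names = sorted(set(node1) | set(node2))
    return {
        name: (node1.get(name), node2.get(name))
        for name in names
        if node1.get(name) != node2.get(name)
    }


# Example with hypothetical value names:
#   diff_settings({"TempDir": "D:\\temp"}, {"TempDir": "E:\\temp"})
```

An empty result means the two exports match; any entry returned points to a change that was made on one node but not the other.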

Database replication

Note: If you use Microsoft SQL Server, a single database is shared by Node 1 and Node 2 (and generally installed on a third node), so database replication is unnecessary.

If you use the MySQL database, the primary node replicates the database by sending SQL statements to the secondary node, which runs them on its own copy of the database. (Replication features built into the database are not used.) During the usually short time between the original update of the primary database and the corresponding update of the secondary database, the SQL statements are stored in an encrypted file named MICSQL.blg. This buffering of SQL statements allows the replication to be done at a later time if the secondary node is down.
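
The statement-forwarding approach described above can be sketched as a queue that is drained in order whenever the secondary is reachable. This is an illustration of the general technique, not MOVEit Central's code: it uses Python's sqlite3 module as a stand-in for MySQL, an in-memory queue as a stand-in for the encrypted MICSQL.blg file, and the class name StatementReplicator and the send callable are assumptions.

```python
import sqlite3
from collections import deque


class StatementReplicator:
    """Applies each statement locally, then forwards it to the secondary (illustrative)."""

    def __init__(self, primary_db, send_to_secondary):
        self.primary_db = primary_db
        # Hypothetical transport: callable(sql, params) that raises
        # ConnectionError when the secondary node is down.
        self.send = send_to_secondary
        # Stand-in for MICSQL.blg; this sketch neither encrypts nor persists it.
        self.pending = deque()

    def execute(self, sql, params=()):
        """Run a statement on the primary, buffer it, and try to forward it."""
        self.primary_db.execute(sql, params)
        self.primary_db.commit()
        self.pending.append((sql, params))
        self.flush()

    def flush(self):
        """Forward buffered statements in order; stop at the first failure."""
        while self.pending:
            sql, params = self.pending[0]
            try:
                self.send(sql, params)
            except ConnectionError:
                return False  # secondary still down; keep the statement buffered
            self.pending.popleft()
        return True
```

Statements are removed from the buffer only after the secondary accepts them, so an outage delays replication without losing or reordering updates.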

After a failover

When the secondary node becomes a primary node, it enables the task scheduler, allowing tasks to be run. It also begins accepting connections from MOVEit Central Admin. Because the other node is dead, the new primary node will initially not be able to replicate changes to it.

When the dead node comes back to life, it checks with the other running node before deciding whether to be the primary or secondary node. Assuming that the other node is still running as the primary node, the formerly dead node will become secondary and will catch up on any changes made while it was down.

Failover alerts

If a MOVEit Central failover occurs, expect the following message to be sent to the email address configured on the "Errors" tab of the MOVEit Central Configuration utility.

From the SECONDARY NODE: "I cannot contact the other MOVEit Central node. I was the secondary node, but I'm becoming primary." Upon receipt of this message, an administrator should take whatever steps are necessary to restore the MOVEit Central service on the old primary node.

When the MOVEit Central service is restored on the old primary node, expect the following messages:

From the PRIMARY (old secondary) NODE: "I was finally able to login to the remote MOVEit Central."
From the SECONDARY (old primary) NODE: "Other node is running as primary. Even though we were primary last time, we'll be secondary now."
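
A monitoring mailbox can recognize these alerts by matching the quoted phrases. The following is a minimal sketch: the event names and the function classify_alert are hypothetical, and the phrases are copied from the messages above, so they may need adjusting if your version words them differently.

```python
# Event names are illustrative; phrases are taken from the alert texts above.
ALERT_PATTERNS = {
    "failover_started": "I was the secondary node, but I'm becoming primary",
    "peer_reconnected": "I was finally able to login to the remote MOVEit Central",
    "rejoined_as_secondary": "we'll be secondary now",
}


def classify_alert(body):
    """Return the event name for a failover alert email body, or None if unrecognized."""
    for event, phrase in ALERT_PATTERNS.items():
        if phrase in body:
            return event
    return None
```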

Resynchronizing

See the "Failover - Common Procedures - Resynchronization" documentation for this procedure. Also, see "Central Service - Failover - Common Procedures - Node Swap" for instructions on switching the primary / secondary roles of two nodes.

MOVEit Central Admin considerations

The Failover Tab allows you to monitor failover status. Additionally, if you connect to a failover-enabled node that is not running in primary mode, all tabs other than Log and Failover will be empty.

How to Make Two Centrals Appear As One

Firewalls and internal servers may be configured to accept connections from only a single IP address. In this case, installers may wish to set up their network so that any non-MOVEit Central machine sees the MOVEit Central cluster as a single IP address. The best way to do this is to use a router that performs network address translation and overloads a single address with multiple sessions. (Cisco calls this "PAT" or "NAT overloading".)

[Figure: CentralResil_PAT_400px.gif]

For example, if Central node #1 has IP address 192.168.5.1 and node #2 has IP address 192.168.5.2, a router can be configured to overload IP address 192.168.6.3. The rest of the world would then know about only 192.168.6.3; neither of the 192.168.5.* addresses would need to be configured in any IP-specific firewall or configuration.
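
The address overloading in this example can be pictured as a translation table keyed by port. The following is a simplified sketch of how PAT works in general, not router configuration: the class name PatRouter, the starting port, and the session ports are illustrative assumptions.

```python
class PatRouter:
    """Toy model of port address translation for outbound sessions (illustrative)."""

    def __init__(self, public_ip, first_port=1024, last_port=65535):
        self.public_ip = public_ip
        self._ports = iter(range(first_port, last_port + 1))
        self.outbound = {}  # (inside_ip, inside_port) -> public_port
        self.inbound = {}   # public_port -> (inside_ip, inside_port)

    def translate_out(self, inside_ip, inside_port):
        """Map an inside session to (public_ip, public_port), reusing existing entries."""
        key = (inside_ip, inside_port)
        if key not in self.outbound:
            public_port = next(self._ports)
            self.outbound[key] = public_port
            self.inbound[public_port] = key
        return (self.public_ip, self.outbound[key])

    def translate_in(self, public_port):
        """Map a reply arriving on public_port back to the inside session, or None."""
        return self.inbound.get(public_port)


# Both nodes appear to remote hosts as 192.168.6.3, distinguished only by port:
#   router = PatRouter("192.168.6.3", first_port=40000)
#   router.translate_out("192.168.5.1", 5000)  # node #1
#   router.translate_out("192.168.5.2", 5000)  # node #2
```

Because remote hosts see only the shared address, whichever node is currently primary can connect out without any change to remote firewall rules.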