Appendix B

RMG Sample Application


Introduction
RMG State Model
RMG Process Functional Description
Startup/Initialization
Failure Detection and Recovery
Running the RMG Sample Application
Commands
Tracing

Introduction

The RMG sample application is provided as an example management application for controlling a redundant board pair with the HM API. Each instance of the RMG application controls one member of a board pair and communicates with a peer RMG process that controls the mate board.

Note: The RMG application is provided solely as an example application for illustrating control of a redundant board-pair through the HM API and as an aid for prototyping redundant configurations. It is not guaranteed to be complete or failure resilient, and is not suitable for live system deployment.

The RMG process can operate in either a single node or dual node configuration. One instance of the RMG process is run for each board. The RMG process communicates with its mate RMG process via UDP/IP using the sockets interface (even when both processes reside on a single node). Together, the RMG processes implement the failure detection and recovery policies recommended for a redundant configuration.
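The single-node case described above, with two processes on one machine talking over UDP sockets, can be illustrated with a minimal Python sketch. It is not the RMG implementation: the message text, the helper names, and the use of the loopback address with OS-assigned ports are all assumptions made for illustration.

```python
import socket


def open_endpoint(local_port: int) -> socket.socket:
    """Bind a UDP socket on the loopback interface (port 0 lets the OS choose)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", local_port))
    sock.settimeout(2.0)  # do not block forever if the datagram is lost
    return sock


def exchange_hello():
    """Send one datagram between two endpoints that share a single node."""
    node1 = open_endpoint(0)
    node2 = open_endpoint(0)  # a different port, since both are on one host
    try:
        # Hypothetical message content; the real RMG wire format is not documented here.
        node1.sendto(b"HELLO from node 1", node2.getsockname())
        data, sender = node2.recvfrom(1024)
        return data, sender == node1.getsockname()
    finally:
        node1.close()
        node2.close()
```

Because both endpoints share one IP address, they can only be told apart by port number, which is why two processes on the same node need distinct UDP ports.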

The RMG process also provides a simple command line interface for issuing HM commands (load a board, halt a board, retrieve board status) and switching control between the primary and backup boards.

Use of the RMG process requires TCP/IP networking and an installed sockets implementation (Windows Sockets version 1.1 or later for Windows NT, the standard BSD sockets library for UNIX). For dual-node configurations, a suitable IP connection between nodes, such as a local area network, is required.

The RMG application can also be used in a standalone configuration (without the mate process) for detecting and recovering from board failures without user intervention.

RMG State Model

The behavior of each signaling node, as implemented through the RMG process, is modeled as a finite state machine where the state of each node is determined by external events such as board failures, signaling node failures, and user commands. The RMG state model is illustrated in Figure 17.

Note: Some transient states and some events/transitions are not shown.

Figure 17. RMG State Model


The following table defines the RMG states:
State Name       Description
----------       -----------
Initial          Initial state upon starting RMG process; determining if board has already been loaded
Loading          Board is being downloaded
Starting         Board is loaded; determining whether mate node is already active or not
Standalone       Board is in standalone (non-redundant) configuration
Primary          Board is in primary mode
Backup           Board is in backup mode, monitoring status of primary board
Out of Service   Board has failed and attempt to reload it has failed; or, halt command received; manual intervention is required to restore board.
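A finite state machine of this kind is often coded as a lookup table keyed by (state, event). The Python sketch below uses the state names from the table above, but the event names and the individual transitions are assumptions inferred from the state descriptions and the prose in this appendix. Figure 17 is not reproduced here, so the sketch may differ from it, and it leaves out the entry into the Standalone state because this text does not say what triggers it.

```python
from enum import Enum, auto


class State(Enum):
    INITIAL = auto()
    LOADING = auto()
    STARTING = auto()
    STANDALONE = auto()
    PRIMARY = auto()
    BACKUP = auto()
    OUT_OF_SERVICE = auto()


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (State.INITIAL, "board_not_loaded"): State.LOADING,
    (State.INITIAL, "board_already_loaded"): State.STARTING,
    (State.LOADING, "load_ok"): State.STARTING,
    (State.LOADING, "load_failed"): State.OUT_OF_SERVICE,
    (State.STARTING, "mate_is_primary"): State.BACKUP,
    (State.STARTING, "mate_not_primary"): State.PRIMARY,
    (State.PRIMARY, "changeover"): State.BACKUP,
    (State.PRIMARY, "board_failed"): State.LOADING,
    (State.BACKUP, "changeover"): State.PRIMARY,
    (State.BACKUP, "heartbeat_lost"): State.PRIMARY,
}


def next_state(state: State, event: str) -> State:
    """Return the next state; events with no table entry leave the state unchanged."""
    if event == "halt":
        # A halt command takes the board out of service from any state.
        return State.OUT_OF_SERVICE
    return TRANSITIONS.get((state, event), state)
```

For example, feeding the events board_not_loaded, load_ok, and mate_not_primary to next_state in turn, starting from State.INITIAL, ends in State.PRIMARY.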

RMG Process Functional Description

The following sections describe the RMG process functionality.

Startup/Initialization

The goal of the initialization phase is to allow independent starting/restarting of signaling nodes, resulting in a synchronized system (i.e., one in which both nodes agree on which is the primary and which is the backup) which restores signaling functionality as quickly as possible.

During initialization, an RMG process attempts to contact its mate to determine whether the mate is already primary. If no response is received or a communication error occurs, the RMG process delays for a short period and retries. If the retry is unsuccessful (or the mate believes it is the backup node), the restarting board becomes the primary board. The delay and retry are necessary to avoid the situation where both nodes initialize simultaneously, are initially unable to contact each other, and both become primary.

To resolve startup "glare", where both nodes initialize simultaneously, each RMG process is assigned a node number (1 or 2). When startup glare is detected, the lower-numbered node (node 1) becomes the primary node and the higher-numbered node becomes the backup.
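The startup decision described above can be condensed into a single function. The following Python sketch is illustrative only: the function name and the string values used for the mate's reply (in particular "starting" to represent glare) are assumptions, not part of the RMG protocol.

```python
def startup_role(node_number, mate_reply):
    """Choose this node's role at startup.

    mate_reply is a hypothetical summary of what the mate reported:
      None       - no response, even after the delay and retry
      "primary"  - the mate is already primary
      "backup"   - the mate believes it is the backup
      "starting" - the mate is initializing at the same time (glare)
    """
    if mate_reply == "primary":
        return "backup"
    if mate_reply is None or mate_reply == "backup":
        return "primary"
    if mate_reply == "starting":
        # Glare: the lower-numbered node (node 1) wins the primary role.
        return "primary" if node_number == 1 else "backup"
    raise ValueError(f"unexpected mate reply: {mate_reply!r}")
```

Because the two nodes are given different node numbers, both sides reach complementary answers during glare without exchanging any further messages.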

Failure Detection and Recovery

To facilitate failure detection and recovery, the primary RMG process monitors its board through the HM API. In addition, the primary RMG periodically sends "heartbeat" messages to the backup, allowing the backup to monitor the primary's status.

When the primary RMG process detects that the board has failed, or a reload or halt command is received, it initiates failure recovery by negotiating a switchover to the backup board. If possible, the failed board is reloaded and brought back into service as the backup board (unless a halt command was received, in which case the board is halted and remains out of service).

The RMG process (both primary and backup) also supports a planned changeover command, which causes the primary and backup boards to switch roles.

In order to detect a failure of the primary RMG process or signaling node, the backup RMG continuously monitors for the receipt of "heartbeat" messages from the primary. If none are received for some time (e.g., 5 consecutive heartbeat periods) the backup initiates its own recovery, switching into primary mode.
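The backup's timeout rule can be sketched as a small monitor object. This Python example is a sketch under assumptions: the class and method names are hypothetical, the heartbeat period is a parameter because its actual value is not stated here, and the limit of 5 missed periods is taken from the example figure in the text above. Timestamps are passed in explicitly so the logic can be exercised without waiting; a real program might supply time.monotonic() values.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen by the backup (illustrative only)."""

    def __init__(self, period_s, started_at, missed_limit=5):
        # Declare the primary lost after this many seconds of silence.
        self.timeout_s = period_s * missed_limit
        self.last_seen = started_at

    def heartbeat_received(self, now):
        """Record the arrival time of a heartbeat from the primary."""
        self.last_seen = now

    def primary_lost(self, now):
        """True once no heartbeat has arrived for missed_limit periods."""
        return (now - self.last_seen) >= self.timeout_s
```

With a 1 second period, a heartbeat received at t=10 keeps primary_lost false until t=15, at which point the backup would begin switching into primary mode.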

Running the RMG Sample Application

The RMG process is started from a Windows NT command-line console or a UNIX command line prompt.

RMG [-b board] [-l loc_port] [-m mate_addr] [-n node] [-p remote_port] [-t]

All run time parameters are optional and are defined in the following table.

Note: Each instance of the RMG process monitors and controls a single board.
Parameter        Description                                                                          Default value
---------        -----------                                                                          -------------
-b board         The board number to monitor on this node                                             1
-l loc_port      The local UDP port for this process to attach to                                     1700
-m mate_addr     The IP address (in "q.x.y.z" dotted notation) or host name of the mate RMG process   None - If omitted, it is assumed that the mate RMG process exists on the same node.
-n node          The node number [1..2] assigned to this RMG process for resolving startup glare      1
-p remote_port   The UDP port where the mate RMG process may be found                                 1700 - Note that if two RMG processes are executed on the same node, they MUST be assigned different UDP port numbers.
-t               Enables tracing to the ctdaemon process (described below)                            No tracing.
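For readers prototyping their own manager, the option set above can be mirrored with Python's standard argparse module. This is a sketch, not the RMG source: the parser, the attribute names, and the same_node_port_clash helper are hypothetical, and only the flags and defaults come from the table.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented RMG options and defaults."""
    p = argparse.ArgumentParser(prog="rmg", add_help=False)
    p.add_argument("-b", dest="board", type=int, default=1)
    p.add_argument("-l", dest="loc_port", type=int, default=1700)
    p.add_argument("-m", dest="mate_addr", default=None)
    p.add_argument("-n", dest="node", type=int, choices=(1, 2), default=1)
    p.add_argument("-p", dest="remote_port", type=int, default=1700)
    p.add_argument("-t", dest="trace", action="store_true")
    return p


def same_node_port_clash(args):
    """True when the mate is assumed local and both ports are identical.

    Hypothetical check for the rule that two RMG processes on one node
    must use different UDP port numbers.
    """
    return args.mate_addr is None and args.loc_port == args.remote_port
```

Parsing the arguments -b2 -m node1 -n2 -t yields board 2, mate address node1, node number 2, and tracing enabled, with both ports left at 1700. That is acceptable in this case because the mate is on a different host.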

Once running, the RMG process displays error messages and status change messages in the console window where it was started. The following example shows sample output from the RMG process:

C:\TEKTX> rmg -b2 -m node1 -n2 -t
rmg: Redundancy manager version 1.0 May  4 1999
Node: 2, Board 2: Board Halted
Node: 2, Board 2: Board Loading
Node: 2, Board 2: Now Starting
Node: 2, Board 2: Board Isolated
Node: 2, Board 2: Now Primary

RMG>

Commands

The RMG process includes a simple command line user interface for issuing HMI commands.

Note: The RMG process does not automatically issue a prompt unless the user presses <return> (to prevent scrambling messages being displayed). User commands, however, can be entered at any time, with or without the prompt.

The RMG process supports the following commands. Each command may be abbreviated with the single character shown in bold in the following table. A list of available commands can be displayed with the HELP (?) command.
Command     Description
-------     -----------
STATUS      Displays current board status/statistics
CHANGE      Swaps current primary and backup, if possible
HALT        Halts the board, taking it out of service
LOAD        Reloads the board
RESET       Resets the HMI service
QUIT        Quits this process, without disturbing board
HELP | ?    Displays list of available commands

Tracing

Tracing of events processed by the RMG process can be enabled by starting the CT Access ctdaemon process and running the RMG process with the -t option. This can be helpful in understanding the sequence of events in certain scenarios.

Note: Configuring and starting the ctdaemon process is described in the CT Access Developer's Reference Manual.

The following example shows sample trace output from the ctdaemon process when the RMG process is run with tracing enabled:

CT Access Daemon V.5 (Mar  4 1999)
ctdaemon: Configuration file './cta.cfg':
      [ctasys] section loaded.
ctdaemon: Configuration file './cta.cfg':
      [ctapar] section loaded.
ctdaemon> MESG: Thu May 13 10:31:10 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  | DEBUG: RMG Controller FSM started
MESG: Thu May 13 10:31:10 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: INITIAL       Event: Board Halted      
MESG: Thu May 13 10:31:10 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: LOADING       Event: Board Loading     
MESG: Thu May 13 10:31:14 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: LOADING       Event: Now Starting      
MESG: Thu May 13 10:31:14 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: STARTING      Event: Board Isolated    
MESG: Thu May 13 10:31:17 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: STARTING      Event: Timer_T1          
MESG: Thu May 13 10:31:20 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: STARTING      Event: Timer_T1          
MESG: Thu May 13 10:31:20 1999
  | pid=6a tid=75 ctahd=80010002 (RMGCMD) uid=0 tag=4003 sev=0
  |  RMGC State: ACTIVE        Event: Now Primary       
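When studying a longer trace, it can help to reduce it to the sequence of state/event pairs. The Python sketch below does this with a regular expression; it assumes that every relevant line follows the "RMGC State: ... Event: ..." layout seen in the sample above, which is an inference from that sample and not a documented format.

```python
import re

# Matches lines such as "  |  RMGC State: LOADING       Event: Now Starting"
TRACE_RE = re.compile(r"RMGC State:\s+(\S+)\s+Event:\s+(.+?)\s*$")


def parse_trace(text):
    """Return (state, event) pairs in the order they appear in the trace."""
    pairs = []
    for line in text.splitlines():
        match = TRACE_RE.search(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
    return pairs
```

Lines that do not carry a state/event pair, such as the MESG timestamp lines and the pid/tid lines, are skipped.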





tech_support@nmss.com
Copyright © 2000, Natural MicroSystems, Inc. All rights reserved.