(Page 1 of 1 in this chapter) Version


Chapter 1

Introduction


1.1 Overview
1.2 System Requirements

1.1 Overview

The SS7 Redundancy/Health Management subsystem is a set of hardware and software components that support the development of distributed, highly available call processing systems employing SS7 signaling. These systems can detect and recover from signaling link failures, board failures, and even node failures without a total service outage. This architecture also facilitates the design of systems whose hardware and/or software components can be upgraded, or whose call-handling capacity can be increased or decreased.

The core of the architecture is an extended SS7 software capability that allows two TX boards to be paired in a primary/backup arrangement. The boards are connected by a private high-speed Ethernet link over which they exchange signaling messages and state information.

The TX boards may be spread across two signaling nodes (multi-chassis) or may be located in the same signaling node (single chassis). The two boards appear to the rest of the SS7 network as a single signaling point (SP) with a single point code.

Signaling links are distributed across both the primary and backup boards. The SS7 MTP 2 layer is active on both boards, allowing all configured signaling links to be active and eliminating the need for provisioning of spare signaling links.

The SS7 MTP 3 layer message routing and management functions are operational only on the primary board. Link and route status changes are checkpointed to the backup MTP 3 layer to ensure that it has up-to-date network status information in case of a primary board outage.
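
The checkpointing relationship described above can be pictured with a minimal sketch. The class and method names here (Mtp3StatusTable, PrimaryMtp3, status_change) are hypothetical and are not part of the product API; the sketch only illustrates the idea that every link or route status change applied on the primary is also forwarded to the backup.

```python
from dataclasses import dataclass, field


@dataclass
class Mtp3StatusTable:
    """Hypothetical link/route status view held by one MTP 3 layer."""
    links: dict = field(default_factory=dict)   # link id -> "up" or "down"
    routes: dict = field(default_factory=dict)  # point code -> "accessible" or "inaccessible"

    def apply(self, kind, key, status):
        table = self.links if kind == "link" else self.routes
        table[key] = status


class PrimaryMtp3:
    """Hypothetical primary layer: applies a change locally, then checkpoints it."""

    def __init__(self, backup):
        self.status = Mtp3StatusTable()
        self.backup = backup

    def status_change(self, kind, key, status):
        self.status.apply(kind, key, status)
        # Checkpoint the same change so the backup stays current.
        self.backup.apply(kind, key, status)


backup = Mtp3StatusTable()
primary = PrimaryMtp3(backup)
primary.status_change("link", 0, "up")
primary.status_change("route", "1.1.2", "inaccessible")
```

After these two changes the backup table holds the same link and route status as the primary, which is the property the backup relies on if it has to take over.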

The SS7 ISUP layer also operates in a primary/backup fashion, with all circuit-switched connections managed by the active board. Call state information can be checkpointed by the local application to the backup ISUP entity, through extensions to the normal call processing APIs, so that stable calls may be preserved across a signaling board or node outage.

The SS7 SCCP layer operates along with other SS7 layers in a two-board redundant primary/backup configuration, in addition to the current single-board standalone configuration. The objective of the redundant configuration is to maintain the SCCP service across a failure. The backup SCCP layer can re-synchronize its internal state with the primary SCCP layer in cases where communication with the primary board is lost and then re-established, or when the backup board has been reloaded due to a failure or routine maintenance.

The TCAP layer also operates in a primary/backup mode. To allow a backup TCAP task to immediately take over service, the primary TCAP task sends checkpoint messages to inform the backup task of changes in various TCAP transactions. Additionally, if the primary and backup tasks become disconnected (due to a failure, or a reloading of the backup board), the backup task will be able to retrieve the current transaction states for all of the transactions on the primary task.
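
As an illustration only, the following sketch shows the two mechanisms described above: incremental checkpoint messages while the boards are connected, and a full retrieval of transaction state after a disconnection. All names (PrimaryTcap, BackupTcap, resync) are hypothetical, and for simplicity the sketch assumes every transaction is checkpointed.

```python
class BackupTcap:
    """Hypothetical backup task holding checkpointed transaction states."""

    def __init__(self):
        self.transactions = {}

    def checkpoint(self, tid, state):
        self.transactions[tid] = state

    def resync(self, primary):
        # Retrieve the current state of every transaction from the primary,
        # then resume receiving incremental checkpoints.
        self.transactions = primary.snapshot()
        primary.backup = self


class PrimaryTcap:
    """Hypothetical primary task; backup is None while the boards are disconnected."""

    def __init__(self):
        self.transactions = {}
        self.backup = None

    def update(self, tid, state):
        self.transactions[tid] = state
        if self.backup is not None:
            self.backup.checkpoint(tid, state)

    def snapshot(self):
        return dict(self.transactions)


primary, backup = PrimaryTcap(), BackupTcap()
backup.resync(primary)             # initial connection
primary.update(1, "begin")         # checkpointed immediately
primary.backup = None              # boards become disconnected
primary.update(1, "continue")      # missed by the backup
primary.update(2, "begin")         # missed by the backup
stale = dict(backup.transactions)  # backup still has the old view
backup.resync(primary)             # connection re-established
```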

The health management (HM) subsystem consists of a daemon process, which constantly monitors the status of the boards, and an application programming interface (API), which allows applications to control the operation of the boards and continuously monitor their status. The health management subsystem can also be used to monitor and control a board in a non-redundant, single-board configuration, known as a standalone configuration.

Operation of the signaling subsystem is under complete control of the local signaling application(s). The application designates each board as either the primary or the backup after the board is downloaded.

During normal operation, applications using SCCP can behave in the normal fashion. The application has no checkpointing responsibilities other than updating a backup host in a dual-chassis arrangement, if necessary. For class 0 connectionless service, best-effort delivery is maintained across switchovers. The only state information maintained between the primary and backup SCCP layers is the accessible/inaccessible status of the remote SP/SSN.

For class 1 connectionless service, SLS values assigned to a sequence are not retained across switchovers, so no checkpointing of SLS assignments (SCLI data structures) is required. The backup must, however, avoid re-using frozen segmentation local references (those recently assigned by the primary) for some period after a switchover, so their usage must be synchronized with the backup application.
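
The freezing rule above can be sketched as a small allocator. This is an assumption-laden illustration, not the product's implementation: the class name, the freeze period, and the way the new primary learns which references were recently assigned are all hypothetical. The 24-bit reference range follows the size of the SCCP segmentation local reference field.

```python
class LocalRefAllocator:
    """Hypothetical allocator that skips frozen segmentation local references."""

    REF_SPACE = 2 ** 24  # segmentation local reference is a 24-bit value

    def __init__(self, freeze_seconds):
        self.freeze_seconds = freeze_seconds  # assumed, configurable freeze period
        self.frozen_until = {}                # reference -> time it becomes usable
        self.next_ref = 0

    def freeze(self, refs, now):
        """Mark references recently assigned by the old primary as frozen."""
        for ref in refs:
            self.frozen_until[ref] = now + self.freeze_seconds

    def allocate(self, now):
        """Return the next reference that is not currently frozen."""
        while True:
            ref = self.next_ref
            self.next_ref = (self.next_ref + 1) % self.REF_SPACE
            if self.frozen_until.get(ref, 0) <= now:
                self.frozen_until.pop(ref, None)
                return ref


# Switchover at t=100: references 0 and 1 were recently used by the old primary.
alloc = LocalRefAllocator(freeze_seconds=30.0)
alloc.freeze([0, 1], now=100.0)
first_after_switchover = alloc.allocate(now=101.0)
```

In this example the first reference handed out after the switchover is 2, because 0 and 1 remain frozen until the assumed 30-second period has elapsed.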

In general, for both classes of connectionless service, messages may be lost on a switchover. Any detection and recovery of lost messages is the responsibility of the application-level protocol running above SCCP.

For both classes of service, segmented messages in the process of being transmitted or received are lost or discarded on a switchover. If the (new) primary receives the remaining segments of a partially reassembled incoming message that was lost or discarded due to a switchover, it detects and discards them. If any of these segments has the return option set, it is returned to the sender in an XUDTS message with a return cause of "Segmentation failed" for ITU or "Error in message transport" for ANSI.
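
The handling of orphaned segments can be sketched as follows. The function name, the dictionary layout of a segment, and the reassembly table are hypothetical; only the two return cause strings come from the text above.

```python
# Return cause placed in the XUDTS message, per protocol variant.
RETURN_CAUSE = {
    "ITU": "Segmentation failed",
    "ANSI": "Error in message transport",
}


def handle_segment(reassembly, segment, variant):
    """Decide what a (new) primary does with one incoming segment.

    reassembly maps (source, local_ref) to the list of segments received
    so far. It is empty right after a switchover, so a non-first segment
    with no matching entry belongs to a message lost in the switchover.
    """
    key = (segment["source"], segment["local_ref"])
    if segment["first"]:
        reassembly[key] = [segment["data"]]
        return ("buffered", None)
    if key not in reassembly:
        if segment["return_option"]:
            return ("returned", RETURN_CAUSE[variant])
        return ("discarded", None)
    reassembly[key].append(segment["data"])
    return ("buffered", None)


# A trailing segment arrives at the new primary just after a switchover.
orphan = {"source": "2-2-2", "local_ref": 7, "first": False,
          "return_option": True, "data": b"tail"}
result_itu = handle_segment({}, orphan, "ITU")
result_ansi = handle_segment({}, orphan, "ANSI")
```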

During normal operation, applications using TCAP can behave in the normal fashion. The primary TCAP task checkpoints TCAP transaction information, and this checkpointing is configurable. A user application may configure each User SAP to checkpoint, by default, all transactions, only those transactions initiated by the application, or no transactions. A user application may override the default checkpoint action by checkpointing transactions on an individual basis.
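
The decision described above, a per-SAP default with an optional per-transaction override, can be expressed as a small function. The enumeration and parameter names are hypothetical and do not correspond to actual configuration fields.

```python
from enum import Enum


class SapDefault(Enum):
    """Hypothetical per-User-SAP default checkpoint action."""
    ALL = "all"                      # checkpoint every transaction
    APP_INITIATED = "app_initiated"  # checkpoint only application-initiated ones
    NONE = "none"                    # checkpoint nothing by default


def should_checkpoint(sap_default, initiated_by_app, override=None):
    """Return True if a transaction should be checkpointed.

    override is the application's per-transaction choice (True or False),
    or None to accept the User SAP default.
    """
    if override is not None:
        return override
    if sap_default is SapDefault.ALL:
        return True
    if sap_default is SapDefault.APP_INITIATED:
        return initiated_by_app
    return False


# A remotely initiated transaction on a SAP that checkpoints only
# application-initiated transactions, first by default, then overridden.
by_default = should_checkpoint(SapDefault.APP_INITIATED, initiated_by_app=False)
overridden = should_checkpoint(SapDefault.APP_INITIATED, initiated_by_app=False,
                               override=True)
```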

A transaction may be checkpointed at any time during its lifetime. For example, after a Begin message is received, the user application may send a Continue message and specify that the transaction should now be checkpointed. Although the Begin message was not checkpointed, the transaction is checkpointed when the Continue message is sent. The TCAP task keeps track of which transactions are checkpointed and deletes the checkpoints when the transactions are closed.
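
The Begin/Continue example above can be traced with a minimal tracker. The class is hypothetical, and it assumes that once a transaction has been checkpointed, later changes to it are checkpointed as well, and that End and Abort close a transaction.

```python
class CheckpointTracker:
    """Hypothetical record of which transactions the backup knows about."""

    def __init__(self):
        self.backup_view = {}  # transaction id -> last checkpointed message type

    def on_message(self, tid, msg_type, checkpoint):
        if msg_type in ("End", "Abort"):
            # Transaction closed: delete its checkpoint, if it had one.
            self.backup_view.pop(tid, None)
            return
        if checkpoint or tid in self.backup_view:
            self.backup_view[tid] = msg_type


tracker = CheckpointTracker()
tracker.on_message(5, "Begin", checkpoint=False)    # not checkpointed yet
after_begin = dict(tracker.backup_view)
tracker.on_message(5, "Continue", checkpoint=True)  # checkpointed from here on
after_continue = dict(tracker.backup_view)
tracker.on_message(5, "End", checkpoint=False)      # checkpoint deleted on close
after_end = dict(tracker.backup_view)
```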

If using ISUP, the application should checkpoint call status changes to the ISUP layer on the backup board, as necessary to preserve stable calls. Upon detection of a failure of the primary signaling board (through the health management interface) or failure of the primary application/signaling node (through application-specific means), the application directs the backup signaling board to become the primary and take over signaling operations. Finally, when a failed signaling board is restored to service as the backup, the application may re-synchronize it with the primary board by checkpointing the state of each circuit through the call processing API extensions.
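
The application's role in this sequence can be outlined as two handlers. This is a sketch under stated assumptions: the Board class, the handler names, and the call state strings are hypothetical, and the real health management and call processing APIs are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    """Hypothetical application-side record of one signaling board."""
    name: str
    role: str                # "primary" or "backup"
    in_service: bool = True
    circuits: dict = field(default_factory=dict)  # circuit id -> checkpointed call state


def on_board_failure(failed, peer):
    """Failure reported: if the primary failed, direct the backup to take over."""
    failed.in_service = False
    if failed.role == "primary" and peer.in_service:
        peer.role = "primary"
        failed.role = "backup"


def on_board_restored(restored, primary):
    """Board restored as backup: re-checkpoint the state of each circuit to it."""
    restored.in_service = True
    restored.role = "backup"
    restored.circuits = dict(primary.circuits)


board_a = Board("A", "primary", circuits={1: "answered"})
board_b = Board("B", "backup", circuits={1: "answered"})

on_board_failure(board_a, board_b)   # board B becomes the primary
board_b.circuits[2] = "answered"     # a new call reaches a stable state
on_board_restored(board_a, board_b)  # board A returns as the backup
```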

1.2 System Requirements

The SS7 Redundancy/Health Management subsystem is supported on the TX 3000 (ISA), TX 3220 (PCI), and TX 3220C (CompactPCI) boards. For redundant configurations, each TX board must be equipped with a 10Base-T or 10/100Base-T Ethernet adapter. Boards must be licensed for each of the desired SS7 layers as well as for redundant operation.

Note: For TX 3000 ISA bus boards, the presence of the Ethernet interface reduces the number of SS7 signaling links that can be terminated on a board by one. For example, a four-link board supports a maximum of three signaling links when configured with an Ethernet adapter for redundancy. Signaling links can be spread across both boards, so the capacity of the pair is the sum of the per-board maximums. That is, eight links can be configured on each board for a maximum of sixteen links.
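
The arithmetic in this note can be checked with a short calculation. The function names are hypothetical, and the sketch assumes the one-link reduction applies only to ISA bus boards fitted with the Ethernet adapter, as the note states. The link counts are the ones used in the note.

```python
def usable_links(board_links, isa_bus, ethernet_fitted):
    """Links that can be terminated on one board.

    Assumes the one-link reduction applies only to ISA bus boards
    that carry the Ethernet adapter.
    """
    if isa_bus and ethernet_fitted:
        return board_links - 1
    return board_links


def pair_capacity(primary_links, backup_links):
    """Links available to a redundant pair, since links span both boards."""
    return primary_links + backup_links


# Four-link ISA board configured with an Ethernet adapter for redundancy.
isa_redundant = usable_links(4, isa_bus=True, ethernet_fitted=True)

# Two boards with eight configured links each.
pair_total = pair_capacity(8, 8)
```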





tech_support@nmss.com
Copyright © 2000, Natural MicroSystems, Inc. All rights reserved.