(Page 1 of 1 in this chapter) Version


Chapter 3

Health Management Interface


3.1 HMI Service Description
3.2 HMI Programming Model
3.2.1 Unsolicited Status Events
3.2.2 Application Requests
3.3 API Functions and Events
3.3.1 ctaOpenServices
3.3.2 Events
3.3.3 Sample Setup

3.1 HMI Service Description

The Health Management feature is composed of several pieces of software. The txmon task executes on the TX board, monitoring its mate and the SS7 tasks. The txmon task sends a heartbeat to a host process every 200 milliseconds. Each heartbeat contains the current run state of the board (primary, backup, task failure, etc.) and the current state of the IBC link (connected or isolated).

The HMI service running on the host watches these heartbeats for state changes as well as for the absence of heartbeats. A change in either the run state or the link state generates an asynchronous event to all registered applications.
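The watching behavior described above can be sketched as a small state tracker. This is only an illustration of the idea, not the HMI service's implementation: every name below is hypothetical, and the number of missed heartbeats tolerated before a board is declared dead (MISSED_LIMIT) is an assumption, since this chapter does not state it.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names are hypothetical. MISSED_LIMIT is an assumed threshold;
 * only the 200 ms heartbeat period comes from the text. */
enum { HEARTBEAT_PERIOD_MS = 200, MISSED_LIMIT = 3 };

typedef enum { WATCH_NONE, WATCH_STATE_CHANGED, WATCH_BOARD_DEAD } WatchResult;

typedef struct {
    int      run_state;     /* last reported run state (primary, backup, ...) */
    int      link_state;    /* last reported IBC link state */
    uint32_t last_seen_ms;  /* arrival time of the most recent heartbeat */
    bool     alive;         /* false once heartbeats have stopped */
} BoardWatch;

static void watch_init(BoardWatch *w, int run_state, int link_state,
                       uint32_t now_ms)
{
    w->run_state = run_state;
    w->link_state = link_state;
    w->last_seen_ms = now_ms;
    w->alive = true;
}

/* Record one heartbeat and report whether either state changed.
 * A heartbeat from a board previously declared dead also counts as a change. */
static WatchResult watch_heartbeat(BoardWatch *w, int run_state,
                                   int link_state, uint32_t now_ms)
{
    bool changed = !w->alive ||
                   run_state != w->run_state ||
                   link_state != w->link_state;

    w->run_state = run_state;
    w->link_state = link_state;
    w->last_seen_ms = now_ms;
    w->alive = true;
    return changed ? WATCH_STATE_CHANGED : WATCH_NONE;
}

/* Called periodically; reports exactly once when heartbeats have stopped. */
static WatchResult watch_tick(BoardWatch *w, uint32_t now_ms)
{
    uint32_t limit = (uint32_t)HEARTBEAT_PERIOD_MS * MISSED_LIMIT;

    if (w->alive && now_ms - w->last_seen_ms > limit) {
        w->alive = false;
        return WATCH_BOARD_DEAD;
    }
    return WATCH_NONE;
}
```

In this sketch a WATCH_STATE_CHANGED or WATCH_BOARD_DEAD result marks the point at which a real monitor would notify its registered applications.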

The HM API allows applications to control the behavior of the HM system and to monitor its performance. The HMI service generates a number of distinct events to registered applications. For example, a task failure event (HMI_EVN_TASKDEAD) is distinct from the event generated by the absence of heartbeats (HMI_EVN_BRDDEAD). When the HMI_EVN_TASKDEAD event is generated, an application can ask the HMI for status to determine which task failed. An application can also control the behavior of the HM system with primitives such as hmiPrimary and hmiBackup.

For information on installing the HMI service, see Chapter 6.

3.2 HMI Programming Model

The HM API allows applications - both call processing applications and user interface (OA&M) applications - to issue requests that control the operation of the signaling subsystem and to monitor "unsolicited" system status events. Separate connections, or handles, are used for the two types of operations. The application interface is intended to support both user-supplied call processing applications, which need to monitor the system status and take appropriate actions, and user interface applications, which need to display the current system status and statistics and to initiate switchovers and resets. The HM application interface makes no distinction between these two types of applications - all registered applications can receive unsolicited status events and can issue any of the supported operation requests.

The HMI service supports up to 16 simultaneous application connections. This limit covers both types of operation and is shared by all applications. The HM API is a CT Access service and behaves accordingly.

3.2.1 Unsolicited Status Events

To register for unsolicited status events, the application calls the ctaOpenServices primitive. This creates a connection to the HMI service and returns a CT Access handle that can be used to wait for incoming messages (for example, with ctaWaitEvent). Whenever ctaWaitEvent returns, the application has received an event on which it may need to take action. To disconnect from the HMI service (such as when the application shuts down), the application calls the ctaCloseServices primitive; this frees the connection slot for use by other applications.

The HMI service reports the current run state of the boards for which an application registers. For example, if a board is in the primary state, the HMI_EVN_NOWPRIMARY event is generated to the registering application.

3.2.2 Application Requests

The other API requests - status, statistics, and control requests - consist of messages exchanged between the calling application and the HMI service. These primitives require a separate connection to the HMI service and operate in a blocking fashion, similar to remote procedure calls. That is, the primitive sends a request to the HMI service and waits up to 15 seconds for the response message, blocking the calling process or thread.

The application calls the ctaOpenServices primitive to establish a connection to the HMI service. Once the API has been opened, the application may call any of the other API primitives. Each request primitive generates a message to the HMI service and waits for the response message - either a confirm response for a successful operation or a refuse response for an unsuccessful operation. The application terminates the connection to the HMI service by calling the ctaCloseServices primitive.

For some request primitives, it may take several seconds to receive a response message. For some applications, blocking for this length of time is not appropriate. These applications should spawn a separate thread to perform the primitive call, ensuring that the main or worker threads are not blocked for a long period of time.

Also, sharing the handle returned from ctaOpenServices among several threads, each of which might generate independent requests, is not recommended, as the response messages may get mixed up among the requesting threads. In this case, each thread should perform its own ctaOpenServices and use its own handle.
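The threading advice above can be sketched with POSIX threads. This is a minimal illustration under stated assumptions: blocking_status_request is a hypothetical stand-in for a blocking HM API request primitive (it is not part of the HM API), and the handle field is a plain integer standing in for the CT Access handle that each thread would obtain from its own ctaOpenServices call.

```c
#include <pthread.h>

/* One job per worker thread. Each job carries its own handle, so no
 * handle is shared between requesting threads. */
typedef struct {
    int handle;  /* stand-in for this thread's own CT Access handle */
    int board;   /* board the request is about */
    int result;  /* filled in by the worker when the request completes */
} RequestJob;

/* Hypothetical stand-in for a blocking request primitive. A real request
 * could block for several seconds while waiting for the response. */
static int blocking_status_request(int handle, int board)
{
    (void)handle;
    return 100 + board;  /* placeholder result for the sketch */
}

/* Thread body: perform the blocking call off the main thread. */
static void *request_worker(void *arg)
{
    RequestJob *job = arg;

    job->result = blocking_status_request(job->handle, job->board);
    return NULL;
}

/* Start the request on its own thread. Returns 0 on success, as
 * pthread_create does. The caller joins the thread to collect the result. */
static int start_request(pthread_t *tid, RequestJob *job)
{
    return pthread_create(tid, NULL, request_worker, job);
}
```

The caller keeps running after start_request returns and reads job.result only after pthread_join succeeds, so the main thread never waits on the HMI response directly.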

The HMI service accepts status and statistics requests in any state and returns the appropriate response message on the connection between the HMI service and the application.

Reset requests are used to instruct the HMI service to re-read its configuration file. The request is primarily intended to provide an operational means to address unplanned conditions or configuration updates. The primary/backup state information is not reset at this point.

3.3 API Functions and Events

This section describes in detail the functions and events that are used by the health management API.

3.3.1 ctaOpenServices

There are a couple of parameters that must be passed to the ctaOpenServices call in the svcargs.args element of the CTA_SERVICE_DESC structure, as explained in the following table:

Element   Description
-------   -----------
args[0]   This value should be a unique index for each connection made to
          the HMI service, 0-15 (there is a maximum of 16 open connections).

args[1]   Identifies which board this channel will manage. If this is a call
          to open an event channel, this can be set to HMI_EVENTS_ALL_BOARDS
          to receive events for all boards managed by the HMI service.

args[2]   This value should be set to HMI_RCV_EVENTS to receive events, or
          set to HMI_DO_COMMANDS in order to perform actions.
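Because args[0] must be unique per connection and is limited to 0-15, an application that opens several connections may find it convenient to hand out indexes from a small allocator. The following is a sketch only: MAX_HMI_SLOTS, slot_acquire, and slot_release are hypothetical names, and the allocator tracks indexes within a single process. This chapter does not say whether indexes must also be coordinated between separate applications.

```c
#include <stdint.h>

/* Hypothetical name; the value 16 comes from the 0-15 index range above. */
#define MAX_HMI_SLOTS 16

/* Bit n set means connection index n is currently in use. */
static uint16_t used_slots;

/* Return the lowest free connection index, or -1 if all 16 are taken. */
static int slot_acquire(void)
{
    for (int i = 0; i < MAX_HMI_SLOTS; i++) {
        if (!(used_slots & (1u << i))) {
            used_slots |= (uint16_t)(1u << i);
            return i;
        }
    }
    return -1;
}

/* Mark an index as free again, for example after closing the connection. */
static void slot_release(int i)
{
    if (i >= 0 && i < MAX_HMI_SLOTS)
        used_slots &= (uint16_t)~(1u << i);
}
```

An application would place the value returned by slot_acquire in args[0] before opening the service and call slot_release once the corresponding connection has been closed.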

The CT Access service name is hmi and the CT Access service manager name is hmimgr. Place these names in the CTA_SERVICE_DESC structure when opening the service. To use the tracing service, also add these names to the cta.cfg file, as shown in the following sample configuration:

[ctasys]
Service = adi, adimgr
Service = hmi, hmimgr    # trace the HM API

TraceMask = 0
StartWebServer = 1       # Change to 0 to disable ctdaemon web server.
StartTraceServer = 1     # Change to 0 to disable ctdaemon trace server.
HttpPort  = 1100         # TCP/IP port for web server.
TracePort = 1101         # TCP/IP port for trace server.
TraceMaxControllers = 1  # Num. clients allowed to set tracemask.
TraceMaxMonitors = 10    # Num. clients allowed to monitor trace msgs.
[ctapar]
[eof]

See the CT Access Developer's Reference Manual for more details.

3.3.2 Events

As previously stated, an application can register for asynchronous events. The registering application can request all events for all configured TX boards or can register for specific boards, one at a time. The event and board number are the only relevant elements in events received by the application, in the event and value fields of the CT Access CTA_EVENT structure. Here is a complete table of the possible events:

Event Code              Description
----------              -----------
HMI_EVN_ISOLATED        The mate board is now unavailable from this board

HMI_EVN_CONNECTED       The mate board is now available from this board

HMI_EVN_NOWPRIMARY      This node is now the primary node (but application
                        layers may not yet be available)

HMI_EVN_BRDDEAD         The board is dead, should reload

HMI_EVN_TASKDEAD        A task on board is dead, should reload

HMI_EVN_HALTED          The previously requested halt is finished

HMI_EVN_LOADING         The previously requested load has started

HMI_EVN_NOWBACKUP       This node is now the backup node

HMI_EVN_NOWSTANDALONE   This node is now stand alone

HMI_EVN_STARTING        This node is freshly started

HMI_EVN_CONFLICT        There is confusion in the state

HMI_EVN_SERVICE_DOWN    This node is being shutdown now; all API connections
                        should be closed. The board number is not relevant
                        for this event.

HMI_EVN_STOP            Application should close communications to the TX
                        board (Close services)

HMI_EVN_INSERT          TX board has been inserted

HMI_EVN_EXTRACT         TX board extraction pending
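One way to act on the events in the table is to sort them by the response the descriptions call for: reload the board, close services, or simply record the new state. The sketch below illustrates that grouping. The DEMO_EVN_* names and their values are placeholders invented for this example; a real application would compare against the HMI_EVN_* codes supplied by the HM API header, whose numeric values are not given in this chapter.

```c
/* Placeholder event codes mirroring the table above. These are NOT the
 * real HMI_EVN_* values; they exist only to keep the sketch self-contained. */
typedef enum {
    DEMO_EVN_ISOLATED,
    DEMO_EVN_CONNECTED,
    DEMO_EVN_NOWPRIMARY,
    DEMO_EVN_BRDDEAD,
    DEMO_EVN_TASKDEAD,
    DEMO_EVN_HALTED,
    DEMO_EVN_LOADING,
    DEMO_EVN_NOWBACKUP,
    DEMO_EVN_NOWSTANDALONE,
    DEMO_EVN_STARTING,
    DEMO_EVN_CONFLICT,
    DEMO_EVN_SERVICE_DOWN,
    DEMO_EVN_STOP,
    DEMO_EVN_INSERT,
    DEMO_EVN_EXTRACT
} DemoEvent;

/* Hypothetical categories of application response. */
typedef enum { ACT_RECORD, ACT_RELOAD_BOARD, ACT_CLOSE_SERVICES } Action;

/* Map an event to the response suggested by its description in the table. */
static Action action_for(DemoEvent ev)
{
    switch (ev) {
    case DEMO_EVN_BRDDEAD:       /* "should reload" */
    case DEMO_EVN_TASKDEAD:      /* "should reload" */
        return ACT_RELOAD_BOARD;
    case DEMO_EVN_SERVICE_DOWN:  /* all API connections should be closed */
    case DEMO_EVN_STOP:          /* close communications to the TX board */
        return ACT_CLOSE_SERVICES;
    default:                     /* state notifications: note and continue */
        return ACT_RECORD;
    }
}
```

Events that fall into the default group still matter to an application that tracks board state; the sketch only separates them from events that demand an immediate response.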

3.3.3 Sample Setup

The following code sample opens the service for boards 1 and 2 and then registers for events from all boards. In this sample, after opening these services, a loop watching for events simply prints out the data from the events as they are received.

Ret = ctaCreateQueue( NULL, 0, &FstQueue );

Ret = ctaCreateContext( FstQueue, 0, NULL, &CtaHd[0] );
Ret = ctaCreateContext( FstQueue, 0, NULL, &CtaHd[1] );
Ret = ctaCreateContext( FstQueue, 0, NULL, &CtaHd[2] );

hmiOpenSvcLst[0].svcargs.args[0] = (DWORD)(1);
hmiOpenSvcLst[0].svcargs.args[1] = 1; /* board */
hmiOpenSvcLst[0].svcargs.args[2] = HMI_DO_COMMANDS; /* synchronous */
Ret = ctaOpenServices( CtaHd[0], &hmiOpenSvcLst[0], 1 );
   
/* Wait for service open to complete. */
hmiTstWaitForEventWithReason( FstQueue, CTAEVN_OPEN_SERVICES_DONE, CTA_REASON_FINISHED );

hmiOpenSvcLst[0].svcargs.args[0] = (DWORD)(2);
hmiOpenSvcLst[0].svcargs.args[1] = 2; /* board */
hmiOpenSvcLst[0].svcargs.args[2] = HMI_DO_COMMANDS; /* synchronous */
Ret = ctaOpenServices( CtaHd[1], &hmiOpenSvcLst[0], 1 );
    
/* Wait for service open to complete. */
hmiTstWaitForEventWithReason( FstQueue, CTAEVN_OPEN_SERVICES_DONE, CTA_REASON_FINISHED );

hmiOpenSvcLst[0].svcargs.args[0] = (DWORD)(3);
hmiOpenSvcLst[0].svcargs.args[1] = HMI_EVENTS_ALL_BOARDS; /* board */
hmiOpenSvcLst[0].svcargs.args[2] = HMI_RCV_EVENTS; /* asynchronous */
Ret = ctaOpenServices( CtaHd[2], &hmiOpenSvcLst[0], 1 );
    
/* Wait for service open to complete. */
hmiTstWaitForEventWithReason( FstQueue, CTAEVN_OPEN_SERVICES_DONE, CTA_REASON_FINISHED );

/*
 *  Print events as they arrive until the HMI service reports
 *  that it is shutting down.
 */
stop = FALSE;
while ( !stop )
{
    hmiTstWaitForEvent( FstQueue, &Event, CTA_WAIT_FOREVER );
    printf("Received event %x from board %d\n", Event.id, Event.value );
    if (Event.id == HMI_EVN_SERVICE_DOWN)
        stop = TRUE;
}

Figure 4 shows the arrangement between the CT Access queues and contexts for the previous code sample:

Figure 4. Relationship Between CT Access Queues/Contexts and HMI Service


In a single node system, an application could control both boards as in the previous example. This application could take appropriate actions to recover from board failures (by making backup boards primary, reloading dead boards, etc.). In a dual node system, it is the responsibility of the applications to choreograph failovers and switchovers when desired or made necessary by failures. Take these situations into account when designing the user part application as well. See Appendix B for an application that can be run in dual node or single node environments.
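The single node recovery described above (making a backup board primary and reloading a dead board) can be sketched as a small decision function. This is an illustration under assumptions, not the Appendix B application: all names are hypothetical, the function only decides which actions to take, and it updates a local model of the two boards rather than waiting for the HMI service to confirm the new run state.

```c
/* Hypothetical local model of a two-board, single node system. */
typedef enum { ROLE_PRIMARY, ROLE_BACKUP, ROLE_DEAD } Role;

typedef struct {
    Role role[2];  /* role[0] and role[1] are the two mated boards */
} BoardPair;

/* Actions returned as bit flags so that both can be requested at once. */
enum { DO_RELOAD = 1u, DO_PROMOTE_MATE = 2u };

/* Called when board `dead` (0 or 1) is reported dead. Always asks for a
 * reload of the dead board; additionally asks for the mate to be made
 * primary when the dead board was primary and the mate is a backup. */
static unsigned on_board_dead(BoardPair *p, int dead)
{
    int mate = 1 - dead;
    Role previous = p->role[dead];
    unsigned actions = DO_RELOAD;

    p->role[dead] = ROLE_DEAD;
    if (previous == ROLE_PRIMARY && p->role[mate] == ROLE_BACKUP) {
        p->role[mate] = ROLE_PRIMARY;  /* optimistic local update */
        actions |= DO_PROMOTE_MATE;
    }
    return actions;
}
```

A real application would carry out DO_PROMOTE_MATE and DO_RELOAD through the HM API control requests and treat the board as primary only after the corresponding event arrives.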





tech_support@nmss.com
Copyright © 2000, Natural MicroSystems, Inc. All rights reserved.