SIP in the VoIP Environment
Session Initiation Protocol (SIP), defined in Internet Engineering Task Force (IETF) RFC 3261, is now a leading contender for the most widely used voice over IP call control protocol. SIP is a text-based protocol, similar to HTTP and SMTP, designed to manage multimedia sessions over the Internet and to support advanced audio services such as conferencing and announcements. While SIP is responsible for determining the peer IP address and port number on which to communicate, it does not carry the media itself. Media is usually carried by the Real-time Transport Protocol (RTP), typically over UDP.
SIP methods (messages) establish connections between two or more SIP User Agents, where a User Agent can be a SIP phone, a PC running a SIP client, or a gateway. SIP provides standard telephony service features such as call hold, unattended transfer, call forward, and three-way conference, as well as advanced features such as find-me, presence, and call screening. Many of SIP's features are managed using SIP servers.
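Because SIP is text-based, a request can be assembled and taken apart with ordinary string handling. The following Python sketch illustrates this under stated assumptions: the host names, tag, and branch values are placeholders, the helper names `build_invite` and `parse_message` are invented for this example, and only a minimal header set is shown. It is not a complete RFC 3261 implementation.

```python
def build_invite(caller: str, callee: str, call_id: str, body: str = "") -> str:
    """Assemble a minimal SIP INVITE as CRLF-delimited text.

    All header values below are illustrative placeholders.
    """
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        "Via: SIP/2.0/UDP gw-a.example.com;branch=z9hG4bK776asdhds",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=1928301774",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
    ]
    if body:
        lines.append("Content-Type: application/sdp")
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_message(raw: str):
    """Split a SIP message into its start line, a header dict, and the body."""
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        # Split on the first colon only; header values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


if __name__ == "__main__":
    message = build_invite("alice@a.example.com", "bob@b.example.com",
                           "abc123@gw-a.example.com")
    start, headers, _ = parse_message(message)
    print(start)
    print(headers["cseq"])
```

The request line and header layout mirror HTTP, which is why the same parsing approach works for both protocols.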
Figure 1: SIP Call Flow
SIP Call Signaling Flow
Figure 1 illustrates the steps involved in setting up and terminating a call using SIP. With time progressing down the page, it shows a call placed over an IP network between two telephones on the PSTN. The initiation of a call from the PSTN side (Area A in Figure 1) drives the start of the SIP call. On the "caller" side, the stack issues an INVITE message to initiate a call to the "called" end. The INVITE message carries the basic call setup information, such as the called and caller addresses, proxy information, and vocoder details used to negotiate a common media format between the two sides. The called end responds with a SIP TRYING message (response code 100) to acknowledge the call request. Signaling then proceeds on the called side to set up the call and generate ringing.
When ringing is successful, a RINGING message (response code 180) is returned to the caller side, where the appropriate ringback can be generated to the caller (Area B in Figure 1). When the called side goes off hook, a connect is performed on that side, enabling the media stream. The SIP stack generates an OK message (response code 200) to indicate this state change. In response, the caller side connects the call and acknowledges the connect with an ACK message. A two-way voice and/or video connection is now established.
The call continues without any SIP message exchange until the called side hangs up, which results in the SIP stack generating a BYE message (Area C in Figure 1). The caller side proceeds with the disconnect on its side and acknowledges the call termination with an OK message. Refer to RFC 3261 for details about the content and format of these messages.
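The message sequence described above can be sketched as a small state machine seen from the caller side. This is a simplified illustration only: the state names and event strings are invented for this example, and the transaction state machines in RFC 3261 additionally cover timers, retransmissions, and error responses.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    CALLING = auto()      # INVITE sent
    PROCEEDING = auto()   # 100 Trying received
    RINGING = auto()      # 180 Ringing received, ringback played
    ANSWERED = auto()     # 200 OK to the INVITE received
    CONNECTED = auto()    # ACK sent, media flowing
    TERMINATING = auto()  # BYE received from the called side
    TERMINATED = auto()   # 200 OK to the BYE sent


# (current state, event) -> next state, following the flow in Figure 1.
TRANSITIONS = {
    (CallState.IDLE, "send INVITE"): CallState.CALLING,
    (CallState.CALLING, "recv 100 Trying"): CallState.PROCEEDING,
    (CallState.PROCEEDING, "recv 180 Ringing"): CallState.RINGING,
    (CallState.RINGING, "recv 200 OK"): CallState.ANSWERED,
    (CallState.ANSWERED, "send ACK"): CallState.CONNECTED,
    (CallState.CONNECTED, "recv BYE"): CallState.TERMINATING,
    (CallState.TERMINATING, "send 200 OK"): CallState.TERMINATED,
}

FIGURE_1_EVENTS = [
    "send INVITE",        # Area A: call setup begins
    "recv 100 Trying",
    "recv 180 Ringing",   # Area B: ringback and answer
    "recv 200 OK",
    "send ACK",
    "recv BYE",           # Area C: called side hangs up
    "send 200 OK",
]


def run_call(events):
    """Walk the event list and return the final state, rejecting bad order."""
    state = CallState.IDLE
    for event in events:
        key = (state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {event!r} in state {state.name}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print(run_call(FIGURE_1_EVENTS).name)
```

A table-driven design keeps the legal message order in one place, so an out-of-sequence message (for example, an ACK before the OK) is rejected rather than silently accepted.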
Using SIP in the Natural Access Development Environment
With the release of Natural Access 2005-1, NMS offers developers two options for using SIP in their applications:
- Choose a SIP stack from another source and use the application to bridge the stack to Natural Access
- License an NMS-provided SIP stack, which is accessed through the Natural Call Control (NCC) API
Using a Third-Party SIP Stack
There are a number of SIP stack vendors to choose from, each with its own advantages and disadvantages. NMS has written demonstration applications with two SIP stacks, showing how easily the application can bridge between the SIP stack and the Fusion voice-over-IP software and Natural Access development environments. To assist developers in the design of SIP-controlled User Agents, NMS created the Fusion SIP Sample Application (FSSA), which is available on the Fusion 4.2 software download page. FSSA demonstrates a simple call setup between two parties using a Fusion-based gateway and a SIP stack. The following section describes this sample application and demonstrates the ease of integrating an "independent" SIP stack with Fusion to create a SIP-controlled gateway.
Fusion-SIP Gateway Sample Application (FSSA) Overview
Multimedia gateways include both control and media functions, which the application ties together, as the Fusion SIP Sample Application demonstrates. Control functions are managed by the SIP stack using an API to process call signaling messages across the Internet. Media functions are controlled through the Fusion API, which is responsible for the transmission of voice or data and the associated encode or decode operations.
Fusion includes the Media Stream Packet Processing protocol (MSPP), which allows the easy creation of end points and a virtual channel to connect the two end points. Typical gateway end points are a DS0 digital voice stream and an RTP packet processing engine. Functions such as compression and decompression are contained in the virtual channel between the end points. Figure 2 illustrates how the application connects the control and media functions together.
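The relationship between the control and media sides can be pictured with a small model. The Python sketch below is purely illustrative: every class and method name in it (`Ds0Endpoint`, `RtpEndpoint`, `Channel`, `GatewayApp`) is hypothetical and does not correspond to the Fusion or MSPP API. It only shows the idea of an application reacting to a control event by connecting two end points through a channel that carries the codec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Ds0Endpoint:
    """Hypothetical TDM-side end point: one DS0 timeslot on a trunk."""
    trunk: int
    timeslot: int


@dataclass(frozen=True)
class RtpEndpoint:
    """Hypothetical IP-side end point: where RTP packets are sent."""
    remote_ip: str
    remote_port: int


@dataclass
class Channel:
    """Hypothetical virtual channel joining the two end points."""
    tdm_side: Ds0Endpoint
    ip_side: RtpEndpoint
    codec: str            # compression applied between the end points
    enabled: bool = False


class GatewayApp:
    """Application glue: control events drive media setup and teardown."""

    def __init__(self, ds0: Ds0Endpoint):
        self.ds0 = ds0
        self.channel: Optional[Channel] = None

    def on_call_answered(self, remote_ip: str, remote_port: int, codec: str) -> Channel:
        # Control side reports the negotiated address and codec;
        # the application then builds and enables the media path.
        rtp = RtpEndpoint(remote_ip, remote_port)
        self.channel = Channel(self.ds0, rtp, codec, enabled=True)
        return self.channel

    def on_call_released(self) -> None:
        # Control side reports call termination; stop the media path.
        if self.channel is not None:
            self.channel.enabled = False


if __name__ == "__main__":
    app = GatewayApp(Ds0Endpoint(trunk=0, timeslot=5))
    print(app.on_call_answered("192.0.2.20", 49170, "PCMU"))
```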
Figure 2: Gateway Control and Media Functions
Licensing an NMS-provided SIP Stack
Beginning with Natural Access 2005-1, NMS has licensed and integrated a SIP stack from a leading vendor, RADVISION. The developer's application has access to that stack through the Natural Call Control API. ISDN and CAS call control protocols are also available through the NCC API. NCC isolates the developer from the details of the protocol, thus making application development quicker and easier. As an illustration of the level of abstraction, an NCC connection setup call results in the stack generating the corresponding SIP INVITE message.
The NMS SIP service also includes an SDP (Session Description Protocol) stack that the application can use for encoding and decoding the media negotiation fields included within the body of a SIP message. The negotiation consists of one User Agent proposing a media session (specifying media type, codec, IP address and port number for a media stream) in the SDP body attached to the SIP message and the other User Agent either accepting or rejecting the proposed media session.
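The proposal and acceptance described above can be sketched in a few lines of Python. This is a minimal illustration, not the NMS SDP stack: the function names are invented for this example, only a single audio stream with static RTP payload types is handled, and the IP addresses are documentation placeholders.

```python
# Static RTP payload type numbers for a few common 8 kHz audio codecs.
STATIC_PAYLOAD_TYPES = {"PCMU": 0, "G723": 4, "PCMA": 8, "G729": 18}


def build_sdp(ip: str, port: int, codecs, session_id: int = 1) -> str:
    """Build a minimal SDP body proposing one audio stream."""
    pts = [STATIC_PAYLOAD_TYPES[c] for c in codecs]
    pt_list = " ".join(str(pt) for pt in pts)
    lines = [
        "v=0",
        f"o=- {session_id} {session_id} IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",               # media IP address
        "t=0 0",
        f"m=audio {port} RTP/AVP {pt_list}",  # media type, port, codecs
    ]
    lines += [f"a=rtpmap:{pt} {c}/8000" for pt, c in zip(pts, codecs)]
    return "\r\n".join(lines) + "\r\n"


def parse_audio(sdp: str):
    """Return (ip, port, payload types) for the audio stream in an SDP body."""
    ip, port, pts = None, None, []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            ip = line.split()[2]
        elif line.startswith("m=audio "):
            fields = line.split()
            port = int(fields[1])
            pts = [int(pt) for pt in fields[3:]]
    return ip, port, pts


def answer(offer_sdp: str, local_ip: str, local_port: int, supported):
    """Accept the first offered codec that is supported, else return None."""
    _, _, offered = parse_audio(offer_sdp)
    names = {pt: name for name, pt in STATIC_PAYLOAD_TYPES.items()}
    for pt in offered:
        name = names.get(pt)
        if name in supported:
            return build_sdp(local_ip, local_port, [name])
    return None  # no common codec: the proposed session is rejected


if __name__ == "__main__":
    offer = build_sdp("192.0.2.10", 49170, ["G729", "PCMU"])
    print(answer(offer, "198.51.100.7", 5004, {"PCMU", "PCMA"}))
```

Here the offering User Agent prefers G.729 but also lists PCMU; an answering User Agent that supports only PCMU and PCMA accepts the session with PCMU, and one with no codec in common rejects it.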
Rapid SIP Development with Open Access
It is clear that SIP has emerged as the standard for VoIP call control and next-generation service creation, supporting interoperability with existing telephony systems and mobility. Open Access gives you freedom to choose either your own SIP stack or a fully integrated version from NMS, simplifying development and reducing time-to-market through universal call control of both IP and TDM applications.
Definitions, Acronyms, and Abbreviations
FSSA: Fusion SIP Sample Application, used to demonstrate using a SIP stack with NMS products
Fusion: Trademark for NMS' VoIP gateway development and runtime environment
H.323: Protocol originally for conferencing, now used for VoIP
HTTP: HyperText Transfer Protocol; text-based protocol between web server and web browser
IETF: Internet Engineering Task Force, which writes standards such as SIP and MGCP
IP: Internet Protocol
ITU: International Telecommunication Union; establishes international communications standards
MSPP: Media Stream Packet Processing protocol used on Fusion
PSTN: Public Switched Telephone Network
RFC: Request for Comments; the name for IETF standards documents
RTP: Real-time Transport Protocol for streaming real-time multimedia over IP in packets
SDP: Session Description Protocol, a text-based description language used for media negotiation between User Agents
SIP: Session Initiation Protocol
SMTP: Simple Mail Transfer Protocol, TCP/IP protocol for sending email between servers