Feature Articles
By Maggie Smith, Director, Product Marketing, Developer Platforms
VoiceXML (VXML) is an open-standard extensible markup language that grew out of the increasing demand for an easy way to create audio-based applications. Like its web counterpart, HTML, VoiceXML uses the markup language model, in its case to make voice-based dialogs easy to develop. VoiceXML is used primarily to create interactive voice response (IVR) applications and is a natural fit for integration with text-to-speech (TTS) synthesis and automatic speech recognition (ASR) servers. NMS's Vision VoiceXML Server supports voice and video VXML-based applications and, in Release 3.0, allows interactive voice and video response (IVVR) applications to use the same speech servers now used with interactive voice solutions.
Here is an example of a more traditional voice-prompted VoiceXML application, with VXML dialog prompts for choosing a beverage. The voice dialog asks the user to choose a drink (coffee, tea, or milk). The user can press a DTMF key or say the name of the preferred drink. The <choice> elements within the menu show which DTMF key is assigned to each drink.
<menu>
  <prompt>
    <par>
      <media src="choices.3gp"/> <!-- Video only -->
      <media type="application/ssml+xml">
        <speak version="1.0">
          For tea, say tea or press 1.
          For coffee, say coffee or press 2.
          For milk, say milk or press 3.
        </speak>
      </media>
    </par>
  </prompt>
  <choice dtmf="1" next="#tea">tea</choice>
  <choice dtmf="2" next="#coffee">coffee</choice>
  <choice dtmf="3" next="#milk">milk</choice>
</menu>
When the user says “tea,” the speech recognizer sends a trigger to the voice browser. The browser matches this trigger to the appropriate choice element, and the menu dialog transitions to the dialog named by that element's next attribute. Here the “#” symbol indicates that the next dialog is within the same document, much as an HTML link can point to an anchor within the same page.
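To make the matching step concrete, here is a minimal Python sketch of how a voice browser might resolve a key press or a recognized word to a choice's next target. It is an illustration only, not the Vision VoiceXML Server's implementation: resolve_next and is_same_document are hypothetical helpers, the menu is reduced to its choice elements, and a plain string stands in for the recognition result that a real browser would receive from the ASR server.

```python
import xml.etree.ElementTree as ET

# The menu from the example, reduced to its <choice> elements.
MENU_XML = """
<menu>
  <choice dtmf="1" next="#tea">tea</choice>
  <choice dtmf="2" next="#coffee">coffee</choice>
  <choice dtmf="3" next="#milk">milk</choice>
</menu>
"""


def resolve_next(menu_xml, user_input):
    """Return the 'next' target of the choice matching a key press or spoken word.

    Hypothetical helper: user_input is either a DTMF digit ("1") or the
    recognized word ("tea"). Returns None when nothing matches.
    """
    menu = ET.fromstring(menu_xml)
    wanted = user_input.strip().lower()
    for choice in menu.findall("choice"):
        spoken = (choice.text or "").strip().lower()
        if wanted == choice.get("dtmf") or wanted == spoken:
            return choice.get("next")
    return None  # a real voice browser would raise a nomatch event here


def is_same_document(target):
    """A target that starts with '#' names a dialog in the current document."""
    return target.startswith("#")
```

For example, resolve_next(MENU_XML, "2") and resolve_next(MENU_XML, "coffee") both return "#coffee", and is_same_document("#coffee") is True because the target begins with "#".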
Now imagine the same VXML script driving an interactive video stream that instructs the user to choose a beverage, either by pressing a key on the telephone keypad or by asking for the drink aloud. When the user speaks, the ASR server recognizes the request and maps it to the matching beverage. The VXML server then plays a second video clip confirming the choice.
Using video as a visual aid to ask for and confirm the choice reduces the time spent listening to voice dialog and prompts. In our example, the user makes a beverage selection faster due to the power of visual aids.
Video applications using the Vision VoiceXML Server Release 3.0 can also play audio tracks from an alternate source. This new feature allows additional customization, such as dubbing a translated language track over a local-language video clip. Mobile TV clips recorded in a local language, such as news or sports broadcasts, might have wider appeal if a translated audio track is dubbed over the audio embedded in the clip.
The Vision VoiceXML Server Release 3.0 supports the latest industry standards, including VoiceXML 2.1 and 2.0, Media Resource Control Protocol (MRCP) to access speech recognition services, and Call Control eXtensible Markup Language (CCXML) to simplify the complexities of call flow and control. In addition, a wide range of voice and video encoders and file formats are available, including H.263, AMR, and .3gp for mobile handset and Internet Protocol (IP) deployment. The Vision VoiceXML Server extends the integration of video capabilities to include access to streamed video content servers, resulting in simplified connections to real-time video streams that are transmitted in the .3gp and H.263 formats for both handset and SIP deployments.
NMS’s Vision VoiceXML Server simplifies the technically challenging task of building interactive voice and video response applications. It provides the key elements (e.g., VXML and CCXML) for rapid development of robust, dynamic applications that involve complex media processes, and its support for ASR and TTS servers expands the flexibility of those applications.
By Jon Mechling, Director of Product Management
Introduction to Mobile Video Applications
With the deployment of 3G wireless networks, carriers worldwide are finally able to offer a broad spectrum of mobile services that transcend voice. Specifically, 3G-324M networks support a range of video capabilities, leveraging the dedicated 64 kbps data connection between the network and the subscriber’s handset.
Emerging video services include music videos, sports shorts, information services, video messaging, and subscriber-to-subscriber video calls. The added value associated with these video capabilities results in higher average revenue per user (ARPU) and decreased subscriber churn.
Essential to the success of these new services is the total quality of the user’s experience. One critical factor in the user experience is the perceived responsiveness of the system, including something as basic as the time it takes to set up a video call—the time it takes from the moment the “Send” button is pressed to the moment the desired video starts to display on the subscriber’s handset.
Customers are accustomed to voice calls setting up in a matter of a couple of seconds—press the “Send” button, and by the time you raise the handset to your ear, you often already hear ringing. Video call setup needs to be perceived as just as fast as voice call setup, or subscribers will think video services are “slow.” That perception will, in turn, damage the potential penetration of these services, and reduce the returns service providers receive for their network investments.
The bottom line—fast call setup is critical to the success of mobile video services.
Improving Call Setup in 3G Mobile Networks
Call setup in 3G networks involves a complex process of messaging, handshaking, and negotiation between 3G-324M terminals. 3G-324M, the umbrella standard for multimedia communication over circuit-switched networks, specifies using the H.245 protocol for much of the call setup negotiation. H.245 can require as many as 10 messages to establish a call (see Figure 1), each of which can introduce about 800 milliseconds of round-trip delay. In unaccelerated form, this signaling results in a video call setup time that can run up to 10 seconds and beyond, which is unacceptable to subscribers used to call setup times of one second or less.
Figure 1: Standard H.245 Handshaking During Call Setup
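As a rough check on the numbers quoted before Figure 1, the short Python calculation below multiplies the worst-case message count by the per-message round-trip delay. It is a back-of-the-envelope sketch: serial_setup_seconds is a hypothetical helper, and it assumes every message waits a full round trip before the next is sent.

```python
def serial_setup_seconds(messages, round_trip_ms):
    """Total signaling delay when each message waits for the previous one."""
    return messages * round_trip_ms / 1000.0


# Worst-case figures from the article: 10 messages at about 800 ms each.
worst_case = serial_setup_seconds(10, 800)
print(f"{worst_case:.1f} s")  # 8.0 s
```

Under these assumptions the H.245 exchange alone accounts for about 8 seconds, which is consistent with total setup times of 10 seconds or more once other steps in the call are added.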
As one approach to speeding up call setup, accelerated H.245 signaling opens the logical channels and allows audio and video to be transmitted before the Multiplex Entry Send (MES) and Master Slave Determination (MSD) messages are exchanged. However, the receiving terminal may reject that incoming media, in which case the sending terminal must try again with a different media type, negating much of the potential time savings of accelerating the signaling in the first place.
WNSRP for Faster Video Call Setup
A key reason the standard H.245 process is slow is that the messages are sent serially—the next message is not sent until its predecessor has been processed and acknowledged. In 2005, the ITU approved WNSRP (Windowed Numbered Simple Retransmission Protocol) as a way to significantly reduce call setup time. With WNSRP, H.245 messages can be sent rapidly, without waiting for confirmation. Missed messages are immediately detected by the receiving handset, and retransmission requests are automatically generated. WNSRP can reduce call setup time to closer to three seconds—significantly better than unaccelerated H.245, but still long enough to negatively affect the customer experience.
MONA Meets Subscribers’ Call Setup Time Requirements
In light of the progress made by WNSRP, but recognizing that further improvements were needed, in 2006 the ITU released the Media Oriented Negotiation Acceleration (MONA) standard, H.324 Annex K. MONA builds on H.245 but adds the concept of “preference messages,” short messages that carry signaling to accelerate the establishment of multimedia sessions.
Two key MONA messages include:
- Media Preconfigured Channels (MPCs) that describe which Media Preconfigured Channel configurations the MONA terminal is capable of using. This message is used to set up audio and video sessions using the most common codecs and configurations.
- Signaling Preconfigured Channels (SPCs) that designate whether the MONA terminal prefers negotiation of logical channels using the Signaling Preconfigured Channel, and, if Media Oriented Setup is used, the terminal’s multiplexer level preference. This is used to negotiate a wider range of possible session types (codecs and configurations).
A MONA-capable terminal begins establishing a session by transmitting the Preference Messages, which contain information about its channel establishment capabilities and preferences. Upon receiving each other’s Preference Messages, the two terminals execute a handshaking process that quickly establishes a video and audio channel between them. As a lower complexity fallback from this negotiation, accelerated H.245 signaling is also supported. And finally, for compatibility with legacy terminals, all MONA-compliant terminals can also quickly fall back to using standard, unaccelerated H.245 setup procedures.
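The fallback order described above can be pictured as a simple ordered selection. The Python sketch below is a deliberate simplification and not the H.324 Annex K procedure: real terminals discover each other's capabilities from the Preference Messages themselves, whereas here a hypothetical peer_supports set stands in for that exchange, and SetupMethod and choose_setup_method are names invented for illustration.

```python
from enum import Enum


class SetupMethod(Enum):
    MONA = "MONA preference messages"
    ACCELERATED_H245 = "accelerated H.245"
    STANDARD_H245 = "standard H.245"


# Fastest first, in the order the article lists the fallbacks.
FALLBACK_ORDER = [
    SetupMethod.MONA,
    SetupMethod.ACCELERATED_H245,
    SetupMethod.STANDARD_H245,
]


def choose_setup_method(peer_supports):
    """Pick the first method in FALLBACK_ORDER that the peer also supports.

    Standard H.245 is the last resort, matching the article's point that
    MONA terminals stay compatible with legacy terminals.
    """
    for method in FALLBACK_ORDER:
        if method in peer_supports:
            return method
    return SetupMethod.STANDARD_H245
```

A legacy peer that only supports standard H.245 yields SetupMethod.STANDARD_H245, while a peer that supports all three yields SetupMethod.MONA.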
The benefit of MONA is that call setup time is typically under one second. The customer experience is much improved, an outcome that could significantly reduce subscriber churn while increasing ARPU.
Summary
Mobile video services offer the possibility of significant new revenue streams to service providers, but only if the customer’s experience with the service is compelling. Key to that positive experience is fast call setup times, comparable to what subscribers are used to on voice calls. Defined by the ITU, MONA builds on traditional H.245 signaling to bring video call setup times under one second. Mobile handsets with MONA support will be coming to market soon. NMS will be supporting MONA in release 3.0 of Video Access (available on July 6), a technology platform for rapidly developing and deploying video applications in 3G mobile networks.
For more information on Video Access 3.0, contact your NMS sales representative.
By Brough Turner, SVP & CTO
The following article was featured in the May 2007 issue of Internet Telephony Magazine (www.tmcnet.com).
OK, I got your attention. Of course, service level agreements are widely available, and that’s one form of quality of service (QoS). But in its popular meaning, QoS means that certain packets get priority over other packets, and there is no such prioritization on the Internet backbone and very little anywhere else in the Internet.
QoS at the Core of the Internet
Prioritization only matters when links are saturated. Once you get beyond the access network, every link in the Internet, whether local, regional, national, or international, carries multiplexed traffic from many, many users. Multiplexing many bursty flows results in relatively predictable traffic. Traffic volumes vary by time of day, but links don’t saturate except as a result of poor engineering or link failures. Either case generates a rapid response from any ISP that expects to remain in business.
In short, except at the edges, i.e., the access network, Internet links may be heavily loaded but are not saturated. Combined with relatively high capacity links, this means typical delay variations are fractions of a millisecond and packet loss is negligible, i.e., best effort is good enough even for low-latency applications like voice telephony. Except during major failures, the effect of QoS in the Internet backbone is negligible.
Consumers won’t pay a premium for performance improvements they can’t see! They might be induced to pay for a brand (after all, people pay premium prices for branded water), but as yet no such brand has emerged. And if one does, consumers will be paying for that brand, not for QoS technology per se.
Broadband Access Links
The one place in the public Internet where limited, highly specific QoS measures make sense is at the consumer end of an asymmetric broadband access link. Typical residential DSL connections offer a few megabits per second (or less) to the home, but only a few hundred kbps from the home to the Internet. Unlike links between core routers, traffic on such access links is very bursty and bursts can saturate the link. If there are no active peer-to-peer applications then, most of the time, little or no traffic flows on most residential broadband connections. Suddenly, someone sends an email with an attached file or photo which saturates the outgoing link for many seconds as several megabytes of data go out at perhaps 250 kilobits (~31 kilobytes) per second.
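A quick Python calculation shows why a single attachment can tie up the uplink for so long. The 250 kbps rate is the article's figure; the 3-megabyte attachment size is an assumed example, and upload_seconds is a hypothetical helper that ignores protocol overhead.

```python
def upload_seconds(size_megabytes, uplink_kbps):
    """Seconds to push a file through the uplink, ignoring protocol overhead."""
    bits = size_megabytes * 1_000_000 * 8
    return bits / (uplink_kbps * 1000)


print(250 / 8)                 # 31.25 kilobytes per second
print(upload_seconds(3, 250))  # 96.0 seconds for an assumed 3 MB attachment
```

At that rate the outgoing link stays saturated for more than a minute and a half, and any other outbound traffic queues behind the attachment for that entire time.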
Typical residential access links are narrow pipes where the cost of purchasing more capacity is prohibitive, if it’s possible at all. On the other hand, we can control the routing policy for what goes out on the link by deploying an appropriate residential router at our end of the broadband access link. As a first approximation, one would like to give priority to VoIP packets (and gamers may want to give priority to specific multi-player games). Simple priority is a good first step, but may not be enough on slow links.
Slow links have an added problem due to large packet serialization delay. VoIP packets are typically less than 150 bytes, while a web page or email is typically delivered in ~1500 byte chunks. At 250 kbps, a 1500 byte packet takes ~50 milliseconds to pass over the link. If a VoIP packet arrives just after a 1500 byte packet has started, it doesn’t matter that the VoIP packet has priority to be sent next; it will have to wait for the current packet to complete. Intermittent 50 ms delays are handled by a jitter buffer at the other end of the VoIP connection, but only at the expense of an additional 50 ms of delay. If the uplink is slower than 250 kbps, serialization delays are even longer.
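The serialization delay is simple to compute: packet size in bits divided by link rate. The Python sketch below uses the packet sizes and the 250 kbps rate from the paragraph above; the 128 kbps link and the 300-byte fragment size are assumed values chosen only for comparison, and serialization_ms is a hypothetical helper.

```python
def serialization_ms(packet_bytes, link_kbps):
    """Time to clock one packet onto the link, in milliseconds."""
    # bits divided by kilobits-per-second gives milliseconds
    return packet_bytes * 8 / link_kbps


print(serialization_ms(1500, 250))  # 48.0 ms, the article's "~50 milliseconds"
print(serialization_ms(150, 250))   # 4.8 ms for a VoIP-sized packet
print(serialization_ms(1500, 128))  # 93.75 ms on an assumed slower 128 kbps uplink
print(serialization_ms(300, 250))   # 9.6 ms if large packets are split into 300-byte fragments
```

The last line suggests why fragmenting large packets helps: a high-priority VoIP packet then waits at most for one short fragment instead of a full 1500 byte packet.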
Luckily this is one place where priorities work and can be imposed. Indeed most consumer VoIP devices incorporate simple priority and some include the ability to fragment large packets (so as to reduce serialization delays). And, because it’s useful for both VoIP and gaming, this functionality is showing up in popular residential routers from Linksys, Netgear, and the like.
Brands Can Command a Premium, But Internet QoS Never Will
Individuals can benefit from simple priority queuing at their end of a broadband access link. But they are not going to pay for benefits they can’t see, so we’re unlikely to see prioritization from ISPs. Operators interested in premium services should focus on branding and perhaps on facilitating simple priority queuing in the access network.
Product News
A new CG 6060 Starter Kit, for sale in Europe and the Americas, is now available. This kit features the one T1/E1 version of the CG 6060 and comes with two incidents of technical support. This kit is available to new NMS customers and those migrating to the CG 6060 from the AG 4040, which is being discontinued (see article in May 2007 issue of Telecom Innovators News). Contact an NMS sales representative for additional information on this new kit and other CG 6060 kits. Don’t know whom to call? Fill out this short inquiry form to have an NMS sales representative contact you.
Articles and Publications
The white paper, Introduction to VoiceXML, explores the advantages of VoiceXML from the developer’s perspective, cites examples of VoiceXML scripts for multimedia applications, and presents NMS Communications’ Vision VoiceXML Server, which allows voice application developers to rapidly develop and reliably deploy voice- and video-based applications for converged networks.
Read the abstract and get your copy of this white paper here.
Support Tip
When loading Natural Access™ on SPARC® Solaris™ 10 machines, be sure to follow these steps before rebooting the machine. Install Natural Access as explained in the installation manual, downloadable from the NMS web site.
- Change the current directory to /usr/kernel/drv
- Locate the following NMS configuration files:
  - aghw.conf
  - cg6k.conf
  - cx.conf
  Note: Depending on the choices you made during the installation, not all three configuration files may be present.
- Add the following line to the configuration files:
  ddi-forceattach=1;
- Reboot the machine
This will ensure you have a proper installation of the Natural Access software package when using SPARC Solaris 10.
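The steps above can also be scripted. The following Python sketch is one possible way to do it, not an NMS-supplied tool: add_force_attach is a hypothetical helper that appends the line to whichever of the three files exist and skips files that already contain it. It assumes a Python interpreter is available on the machine and that the script runs with permission to write to /usr/kernel/drv.

```python
from pathlib import Path

CONFIG_NAMES = ("aghw.conf", "cg6k.conf", "cx.conf")
FORCE_ATTACH = "ddi-forceattach=1;"


def add_force_attach(driver_dir="/usr/kernel/drv"):
    """Append the force-attach line to each NMS config file that exists.

    Returns the names of the files that were changed.
    """
    changed = []
    for name in CONFIG_NAMES:
        path = Path(driver_dir) / name
        if not path.is_file():
            continue  # depending on install choices, a file may be absent
        text = path.read_text()
        if any(line.strip() == FORCE_ATTACH for line in text.splitlines()):
            continue  # line already present, leave the file alone
        separator = "" if text == "" or text.endswith("\n") else "\n"
        path.write_text(text + separator + FORCE_ATTACH + "\n")
        changed.append(name)
    return changed
```

Call add_force_attach() after installing Natural Access and before rebooting. Running it a second time changes nothing, because files that already contain the line are skipped.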
Spotlight Events
Connect is an ecosystem event that provides the optimal setting for networking. You can talk directly to developers, operators, system integrators, equipment manufacturers, and other mobile industry professionals. Rest assured you can get some serious business done at an event like this. This year, events take place in Boston (USA), Guilin (China), and Madrid (Spain).
For 2007 we have selected four conference topics that we believe are vital issues our entire industry needs to contend with: the user experience, the importance of mobile communities, the growing practice of Mashups, and the role IMS plays in achieving service velocity. Guest speakers come from innovative companies and are key players in the ecosystem. NMS is honored that they have agreed to contribute to the event and knows they will play an important role in its success.
NMS is pleased to provide this industry forum, which is free and open to all who wish to participate. Our hope is that the event creates educational and networking benefits that will ultimately turn into business and development success for everyone.
For more information on our Connect series of events, speaker biographies, and to register, please visit our web site at: www.nmscommunications.com/connect2007/About/default.htm.
Americas: October 2-3, Boston, MA, USA
Asia: October 17-18, Guilin, China
Europe: November 7-8, Madrid, Spain
If you missed our web seminar on June 26, and are interested in how you can generate new revenue streams and develop and deploy innovative mobile video applications for 3G networks worldwide, then check out our new webinar archive!
Maggie Smith, Director of Product Marketing, and John Nicol, Director of Technology, Video Products, spoke about “Mobile Video Made Easy.” In their presentation, they discussed how developers are facing stringent performance requirements, multiple interoperability issues, and a variety of other technical challenges. Fortunately, developers do not have to face these challenges on their own. Advanced platforms and toolkits from industry leaders, like NMS, shield them from the technical complexities of mobile video and offer them the advantages they need to produce compelling and profitable applications.
Click here to access the archived recording and/or PDF today!