Wrox Press Ltd  
   

                         
      The ASPToday Article
May 17, 2001
 
   
   
   
Adding Interactive Voice Response (IVR) to your web applications Part 1
by Lance Luttschwager
 
CATEGORIES:  XML/Data Transfer, Other Technologies  
ARTICLE TYPE: Overview
   
    ABSTRACT  
 

Recently WAP and PDAs have been touted as a way to keep in touch with your internet resources while away from your regular computer; but for a variety of reasons, wireless devices have yet to live up to expectations. In the meantime, the use of IVR and a standard touch-tone phone continues to grow, aided by the introduction of two new technologies that bring IVR to the web developer - VoiceXML and Microsoft's Web Telephony Engine (WTE). In this first article of a series, Lance Luttschwager introduces these technologies and shows us how we can set up a simple interactive page accessible over an ordinary phone.




   
                   
   
 
    ARTICLE

Introduction

In recent years, wireless devices ranging from internet-connected PDAs to WAP-enabled cell phones have been touted as the way to stay connected to internet resources while away from your regular computer. Unfortunately, for a variety of reasons, wireless devices have not yet lived up to expectations. In the meantime, Interactive Voice Response (IVR) systems, used from an ordinary touch-tone phone, offer another way to access internet resources while on the road and are attracting a lot of attention. While IVR systems have been around for quite some time, two technologies promise to put IVR capability in the hands of web application developers: VoiceXML and Microsoft's Web Telephony Engine (WTE). Both rely heavily on HTTP and traditional web application development techniques to build full-featured IVR systems.

Topics Discussed in Today's Article

This is the first in a series of articles on IVR technology implementation. It provides an overview of the new IVR technologies and guides you through the setup and testing of a simple IVR application with each technology. Future articles will lead you through the development of more advanced applications. Today's article covers an overview of IVR, the two new IVR technologies, some key IVR terms, and getting started with VoiceXML and with WTE.

Overview of IVR

Most people have been exposed to IVR in some fashion: voice messaging systems in large organizations that let you transfer to a specific department or employee by dialing an n-digit extension, automated flight arrival and departure status systems, automated bank teller applications that let you transfer money between accounts, on-line support and troubleshooting tools, and similar applications. In some cases the IVR application is wholly self-contained (such as automated call transfers), while other implementations tie into back-office applications to access enterprise information (flight status, account balances, etc.). In the past, IVR systems were either developed from scratch in C++ or another programming language, or built on a third-party proprietary IVR system. While these solutions worked, experienced IVR developers were hard to find, and the systems were notoriously expensive to develop and maintain.

Two Exciting new IVR technologies

There are two exciting new technologies that promise to make the development, implementation, and maintenance of IVR systems significantly easier than traditional technologies by relying on the skills already possessed by web application developers. The first is called VoiceXML and is being developed by a consortium of technology companies led by AT&T, IBM, Lucent Technologies, and Motorola. VoiceXML Version 1.0 was released in March 2000 and was acknowledged by W3C in May 2000 [see http://www.w3.org/voice/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ for more information].

The second new technology is Microsoft's Web Telephony Engine. Microsoft has taken a slightly different approach: WTE uses standard HTML as the cornerstone of the IVR language but adds "attributes" to many HTML tags so that the WTE can properly render the page using audio files, text-to-speech synthesis, and speech recognition. This approach will conceivably allow the web developer to create one page that can be rendered both for visual display in a web browser and for IVR through WTE [see http://msdn.microsoft.com/library/psdk/webte/wtestartpage_61et.htm?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ for more information].

Both IVR technologies place a "rendering engine" between the incoming phone call and the hosting web server. When a call is received, the rendering engine is responsible for retrieving the appropriate 'homepage' and then rendering it to the caller using the rules of the language. The following figure shows the logical diagram of a web-based IVR application.

image1

VoiceXML and WTE both support scripting that can be used to validate input or perform complex processing, much as in a regular internet browser (VoiceXML uses JavaScript, while WTE supports both JavaScript and VBScript).

Some IVR Terms

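Before we jump into IVR code, it is useful to spend a minute describing some key concepts of IVR. First, IVR systems "speak" to callers in one of two ways: by playing prerecorded audio files (typically .wav files), or by using text-to-speech (TTS) synthesis.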

With TTS you simply send the rendering engine the text you would like 'read' to the caller. From an application developer's standpoint it is much easier to develop and maintain a site using TTS, because you do not have to record and manage the many .wav files needed for a full-fledged IVR application and you don't need to rerecord the files each time you redesign the system. The drawback with TTS is that in the best case the TTS engine sounds a little weird, and in the worst case it mutilates words to the point that they are unrecognizable to the caller (TTS engines have a particularly difficult time with names of individuals and companies). For this reason, most IVR systems rely heavily on a bank of prerecorded audio files. Nevertheless, TTS is an excellent choice during application design (when you are perfecting the application's flow) and is the only real option when providing callers with 'dynamic' content that is extracted from a database or other data store.
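
One way to combine the two approaches is to play a recording when one exists for a prompt and fall back to TTS otherwise. The Python sketch below only illustrates that decision; the `choose_output` function, the prompt IDs, and the file names are hypothetical and are not part of VoiceXML or WTE.

```python
def choose_output(prompt_id, text, recorded_prompts):
    """Return ("audio", filename) if a recording exists, else ("tts", text)."""
    filename = recorded_prompts.get(prompt_id)
    if filename is not None:
        return ("audio", filename)
    return ("tts", text)


# Hypothetical bank of prerecorded prompts for static text.
RECORDED_PROMPTS = {
    "welcome": "welcome.wav",
    "goodbye": "goodbye.wav",
}

# Static prompt: a recording exists, so the .wav file is played.
static_prompt = choose_output("welcome", "Welcome to our service.", RECORDED_PROMPTS)

# Dynamic prompt: the text is built from a data store at call time,
# so no recording can exist and TTS is the only option.
dynamic_prompt = choose_output("balance", "Your balance is 42 dollars.", RECORDED_PROMPTS)
```

Keeping the decision in one place means a prompt can start out as TTS during design and switch to a recording later without touching the call flow.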

Caller input can also be obtained in two ways. The first is through the telephone's touch-tone keypad [more formally known as dual tone multi-frequency (DTMF) digits]. The second is to use voice recognition software. As with the TTS vs. audio file debate, there are pros and cons to each approach. The benefit of voice recognition is that it is generally easier and faster for the caller than entering data using the DTMF keys. The drawback is that it is very error prone unless you limit the available inputs by defining an acceptable 'grammar' for each input field. A 'grammar' is basically a list of words or phrases that are acceptable responses to the prompt, and will be described in more detail in a subsequent article.
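
To make the idea of a grammar concrete, here is a minimal Python sketch that classifies one caller response against a list of acceptable phrases. It assumes the recognizer has already turned speech or key presses into text; the `interpret_input` function, the keypad mapping, and the result labels are illustrative assumptions, not part of any IVR product.

```python
def interpret_input(utterance, grammar, dtmf_map=None):
    """Classify caller input against a grammar.

    Returns ("match", phrase), ("nomatch", None) or ("noinput", None).
    """
    if utterance is None or not utterance.strip():
        return ("noinput", None)
    cleaned = utterance.strip().lower()
    # A DTMF digit is translated to its phrase before matching.
    if dtmf_map and cleaned in dtmf_map:
        cleaned = dtmf_map[cleaned]
    if cleaned in grammar:
        return ("match", cleaned)
    return ("nomatch", None)


# The grammar limits valid answers to three phrases.
TIME_OF_DAY = {"morning", "afternoon", "evening"}
# Hypothetical keypad alternative for callers who prefer DTMF.
KEYPAD = {"1": "morning", "2": "afternoon", "3": "evening"}

spoken = interpret_input("Morning", TIME_OF_DAY)
keyed = interpret_input("2", TIME_OF_DAY, KEYPAD)
unknown = interpret_input("midnight", TIME_OF_DAY)
```

The smaller the grammar, the fewer ways a response can be misclassified, which is why each input field gets its own list.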

Getting started with VoiceXML

[Note: This example will walk you through the process of testing VoiceXML documents using tellme.com's infrastructure (internet connections, servers, phone lines, etc.) and requires no hardware or software on your end (other than an internet-connected web browser for authoring pages and a touch-tone telephone for testing).]

In this section we will investigate the first IVR technology by establishing a design/test environment for VoiceXML and demonstrating its use with a simple one page application.

Establishing a test account using TellMe.com's VoiceXML Studio

One of the major benefits of developing VoiceXML documents over WTE documents is that you can obtain free access to a VoiceXML engine at TellMe.com, allowing you to test pages without any investment in hardware or software. To do this you must first register at Tellme.com's developer studio, found at http://studio.tellme.com/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ. A registration e-mail will be sent to your address containing the developer ID you selected and a four-digit PIN that will allow you to log on to the developer studio and to test applications using their toll-free number. The developer studio allows you to:

Simple VoiceXML Document

This article is not intended to be a tutorial on VoiceXML, so I am only going to present a simple yet fully functional VoiceXML document [see http://studio.tellme.com/voicexmlref/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ for more information on VoiceXML]. This document: 1) asks the user what time of day it is in their part of the country, 2) delivers a short greeting appropriate for that time of day, and 3) asks the user if they would like to listen to the greeting again or return to the main menu, which starts the process over again. The intent of this code snippet is to show the most basic functions of VoiceXML and to test our development environment.

VoiceXML_greeting.asp:

<vxml>
  <menu id="three_choice_menu">
    <prompt>
      <audio>In your part of the country, is it currently morning, afternoon, or evening?</audio>
    </prompt>
    <choice next="#morning">morning</choice>
    <choice next="#afternoon">afternoon</choice>
    <choice next="#evening">evening</choice>
    <default>
      <reprompt/>
    </default>
    <noinput>
      <audio>Sorry, I didn't hear you. You must respond to this prompt
             with either "morning", "afternoon", or "evening."</audio>
      <reprompt/>
    </noinput>
    <nomatch>
      <audio>Sorry, I didn't understand what you said. Valid responses
             to this prompt are: "morning", "afternoon", or "evening."
             Please try again.</audio>
      <reprompt/>
    </nomatch>
  </menu>

  <form id = "morning">  
    <block>   
      <audio>Good Morning!</audio>   
    </block>
    <field name="ListenAgain" timeout="10"> 
      <prompt>Would you like to listen to this greeting again?</prompt> 
      <grammar>
        <![CDATA[
           YES_NO
         ]]>
      </grammar>
      <filled>   
        <if cond="ListenAgain == 'yes'"> 
          <goto next= "#morning"/>
        <else/>
          <goto next= "#three_choice_menu"/>
        </if> 
      </filled>  
    </field> 
  </form> 

  <form id = "afternoon">  
    <block>   
      <audio>Good Afternoon.</audio>   
    </block> 
    <field name="ListenAgain2" timeout="10"> 
      <prompt>Would you like to listen to this greeting again?</prompt> 
      <grammar>
        <![CDATA[
           YES_NO
         ]]>
      </grammar>
      <filled>   
        <if cond="ListenAgain2 == 'yes'"> 
          <goto next= "#afternoon"/>
        <else/>
          <goto next= "#three_choice_menu"/>
        </if> 
      </filled>  
    </field> 
  </form> 

  <form id = "evening">  
    <block>   
      <audio>Good Evening.</audio>   
    </block> 
    <field name="ListenAgain3" timeout="10"> 
      <prompt>Would you like to listen to this greeting again?</prompt> 
      <grammar>
        <![CDATA[
           YES_NO
         ]]>
      </grammar>
      <filled>   
        <if cond="ListenAgain3 == 'yes'"> 
          <goto next= "#evening"/>
        <else/>
          <goto next= "#three_choice_menu"/>
        </if> 
      </filled>  
    </field> 
  </form> 
</vxml>

In subsequent articles we will expand on this XML document and provide a more thorough description of the document flow and the programming elements/attributes.
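
Until then, the flow of the document above can be approximated as a small state machine. The Python sketch below is only a model for reasoning about the dialog, not how a VoiceXML engine is implemented: the state names mirror the `id` attributes in the document, while the `FLOW` table and the `walk` function are hypothetical, and an unrecognized response is modelled as staying in the same state (a reprompt).

```python
# Each dialog state maps a recognized caller response to the next state.
# The state names mirror the id attributes in the VoiceXML document.
FLOW = {
    "three_choice_menu": {
        "morning": "morning",
        "afternoon": "afternoon",
        "evening": "evening",
    },
    "morning": {"yes": "morning", "no": "three_choice_menu"},
    "afternoon": {"yes": "afternoon", "no": "three_choice_menu"},
    "evening": {"yes": "evening", "no": "three_choice_menu"},
}


def walk(flow, start, responses):
    """Return the list of states visited for a sequence of caller responses."""
    state = start
    visited = [state]
    for response in responses:
        # An unrecognized response keeps the caller in the same state,
        # which corresponds to a reprompt.
        state = flow[state].get(response, state)
        visited.append(state)
    return visited


# Caller says "evening", asks to hear the greeting again, then declines.
path = walk(FLOW, "three_choice_menu", ["evening", "yes", "no"])
```

Tracing a few call scripts through a table like this is a cheap way to confirm that every branch leads somewhere before you pick up the phone.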

Testing a simple VoiceXML application

To test your VoiceXML document, log on to the developer studio (using the developer ID and PIN obtained above), copy and paste the code shown in the previous section into the studio's scratchpad, and press 'update' to save it on the Tellme.com server. Then dial 1-800-555-VXML (or +1 408-678-4465 if outside the US) and say (or press) your developer ID and PIN at the appropriate prompt. Tellme's VoiceXML engine will retrieve the document from your scratchpad and render it using text-to-speech synthesis.
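
Before placing the call, it can save time to check locally that the document is well-formed XML and that every `next="#..."` target refers to an `id` that actually exists. The sketch below uses Python's standard `xml.etree.ElementTree` module; the `check_local_targets` helper and the shortened sample document are hypothetical and are not part of the Tellme studio.

```python
import xml.etree.ElementTree as ET


def check_local_targets(vxml_text):
    """Return the sorted next="#..." targets that match no id in the document.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(vxml_text)
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    missing = set()
    for el in root.iter():
        target = el.get("next")
        if target and target.startswith("#") and target[1:] not in ids:
            missing.add(target[1:])
    return sorted(missing)


# Shortened sample: the "afternoon" form has been left out on purpose.
SAMPLE = """
<vxml>
  <menu id="three_choice_menu">
    <choice next="#morning">morning</choice>
    <choice next="#afternoon">afternoon</choice>
  </menu>
  <form id="morning">
    <block><goto next="#three_choice_menu"/></block>
  </form>
</vxml>
"""

broken_links = check_local_targets(SAMPLE)
```

An empty result means every local jump has a destination; it says nothing about whether the engine accepts the document's elements, which still has to be tested over the phone.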

image2

Other VoiceXML Resources

Another resource that appears to have similar functionality to the TellMe Studio is BeVocal's Café, which can be found at http://cafe.bevocal.com/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ. If you would like to install a VoiceXML server on your own infrastructure, you can obtain a beta version of IBM's Voice Server at http://www.alphaworks.ibm.com/tech/voiceserversdk/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ (I have not worked with this software, so I cannot comment on the features or limitations of the beta version).

Getting started with Web Telephony Engine (WTE)

[Note: This example will walk you through the process of testing WTE documents using Microsoft's Web Telephony Engine. It requires you to establish your own infrastructure (internet connection, servers, telephone lines, etc.) for testing, as well as the software needed to author the ASP pages.]

In this section we will investigate the second IVR technology by establishing a design/test environment for WTE and demonstrating its use with a simple one page application. For general information about WTE see http://msdn.microsoft.com/library/psdk/webte/wtestartpage_61et.htm?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

Establishing a test environment for WTE applications

Unlike VoiceXML, I couldn't find any free developer resources that allow us to design and test WTE documents. Instead, this article will guide you through the process of obtaining and installing Microsoft's WTE on your own server infrastructure. The basic system requirements for WTE are:

For a complete listing of requirements please review Microsoft's literature.

As a reference, I performed all of my development and testing for this article using an IBM ThinkPad 600E with a 366 MHz Pentium II processor, 160 MB RAM, and an integrated voice modem, running Windows 2000 Server, IIS 5.0, SQL Server 7.0, and the WTE.

The basic steps for installing and setting up the WTE are: 1) download and install WTE, 2) download and install a SAPI engine, and 3) configure WTE.

The following sections provide more detailed information about each step.

Download and install WTE

The WTE installation software and support files are part of the Platform Software Development Kit (SDK) and are available free from Microsoft's ftp site at ftp://ftp.microsoft.com/developr/platformsdk/july2000/common/redist/wte/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ. Copy all of the files to the local hard drive of the computer on which WTE will be installed. Before running setup, make sure you have stored the WTE configuration information in Active Directory using wteimp.exe. Then use Windows Explorer to navigate to the folder containing the WTE files and double-click wtesetup.exe. The setup process will unpack the support files and start the installation. During installation you are presented with a few dialog boxes that: 1) ask where you would like to install the files (I used the default location), 2) ask if you would like the computer to be part of a WTE array or operate in stand-alone mode (use stand-alone mode for this demonstration), and 3) ask for the NT account that will be used by WTE callers (I used the local system account for my test environment). The whole WTE software installation should take less than 5 minutes once you have obtained the files.

Download and install a SAPI engine

If you want your WTE application to include voice recognition and text-to-speech synthesis, you must download and install a SAPI 4.0a compliant speech recognition engine [note: Microsoft recently introduced SAPI 5.0, but it is not backwards compatible with SAPI 4.0, so you cannot use it with the existing WTE engine]. You can download Microsoft's SAPI 4.0a SR engine by navigating to http://www.microsoft.com/downloads/release.asp?ReleaseID=26299&WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ and clicking on the SAPI4SDKSUITE.exe and SAPI4SDK.EXE links near the bottom of the page. Copy these files to your WTE server and install both using the same procedure as for the WTE engine.

Configuring MS WTE

The Microsoft Management Console (MMC) is used for administering the WTE. For those not familiar with it, the MMC is becoming Microsoft's standard tool for administering services and consists of a two-pane console with a navigation tree in the left pane and a 'details' pane on the right (similar in appearance to Windows Explorer). For WTE, the MMC contains 5 nodes in the tree for administering: 1) applications, 2) addresses, 3) address groups, 4) logging, and 5) servers. For our demonstration we will only be concerned with the 'applications' and 'addresses' nodes.

image3

The 'addresses' node contains a listing of the modems installed on your WTE server. Click on the 'addresses' icon and ensure that your modem appears in the 'details' pane on the right. If your modem does not show up, check the modem configuration and, if necessary, reinstall the software drivers.

Although there are several configuration options for applications, the basic function of the 'applications' node in the MMC is to associate a modem with the web page that will be used when the WTE answers a call on that modem. To create a new application, right-click on the 'applications' node of the navigation tree, click 'New | Application' on the pop-up menu, and follow the setup wizard, which will prompt you for: 1) an application name and description (I used ASPToday for the name and left the description blank), 2) the application's home page (I installed the test application in folder c:\inetpub\wwwroot\asptoday\IVR\ on the same server running WTE, so I entered http://127.0.0.1/asptoday/ivr?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ as my home page), and 3) the addresses (read: modems) that should be mapped to this application (I mapped all modems to the test application).

Once you have created the application you should open its properties box by right clicking on the application name (in the tree view) and clicking on properties to verify the name and URL are correct.

image4

Now navigate to the 'Addresses' tab to ensure the modem you will be using appears in the 'Associated addresses' window.

image5

Next, navigate to the 'Text-To-Speech' tab and click on the 'Enable text-to-speech engine' option.

image6

Finally, navigate to the 'Speech' tab and click on the 'Enable Speech Recognition engine' option.

image7

When you have made all of these changes, click on 'Apply' and 'OK' to close the properties window.

Simple WTE Document

As stated previously, the WTE engine simply renders a standard HTML document (with optional WTE enhancements) by reading the page to the caller and waiting for input from the caller when necessary (when it encounters a navigation menu or a form requiring user input). The following code snippet produces the same output as the VoiceXML document shown earlier in this article. As you can see, this document is very similar to a standard HTML document. This is not the place for a full list of tags used with the WTE, but for a description of the HTML extensions used with WTE please see http://msdn.microsoft.com/library/psdk/webte/wteauthor1_1ktq.htm?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

<html>
   <body>
      <a id="three_choice_menu"></a>
      In your part of the country is it morning, afternoon, or evening?
      <menu>
         <a href="#morning" Grammar="'morning'"></a>
         <a href="#afternoon" Grammar="'afternoon'"></a>
         <a href="#evening" Grammar="'evening'"></a>
      </menu>

      <a id="morning"></a>
      Good morning.  Would you like to hear this greeting again? 
      <menu>
         <a href="#morning" Grammar="'yes'"></a>
         <a href="#three_choice_menu" Grammar="'no'"></a>
      </menu>

      <a id="afternoon"></a>
      Good afternoon.  Would you like to hear this greeting again? 
      <menu>
         <a href="#afternoon" Grammar="'yes'"></a>
         <a href="#three_choice_menu" Grammar="'no'"></a>
      </menu>

      <a id="evening"></a>
      Good evening.  Would you like to hear this greeting again? 
      <menu>
         <a href="#evening" Grammar="'yes'"></a>
         <a href="#three_choice_menu" Grammar="'no'"></a>
      </menu>

   </body>
</html>
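
To see how little separates this page from ordinary HTML, the following Python sketch extracts the spoken choices from a menu using the standard `html.parser` module. It is not how WTE itself processes a page; the `MenuLinkParser` class is a hypothetical illustration, and it assumes each choice is an `<a>` tag inside a `<menu>` element carrying the `Grammar` attribute shown above.

```python
from html.parser import HTMLParser


class MenuLinkParser(HTMLParser):
    """Collect (grammar phrase, href) pairs from <a> tags inside <menu> elements."""

    def __init__(self):
        super().__init__()
        self.in_menu = False
        self.options = []

    def handle_starttag(self, tag, attrs):
        if tag == "menu":
            self.in_menu = True
        elif tag == "a" and self.in_menu:
            attributes = dict(attrs)  # html.parser lowercases attribute names
            grammar = attributes.get("grammar")
            href = attributes.get("href")
            if grammar and href:
                # Strip the inner single quotes used in the Grammar attribute.
                self.options.append((grammar.strip("'"), href))

    def handle_endtag(self, tag):
        if tag == "menu":
            self.in_menu = False


# The first menu of the WTE page, copied from the listing.
PAGE = """
<menu>
   <a href="#morning" Grammar="'morning'"></a>
   <a href="#afternoon" Grammar="'afternoon'"></a>
   <a href="#evening" Grammar="'evening'"></a>
</menu>
"""

parser = MenuLinkParser()
parser.feed(PAGE)
menu_options = parser.options
```

Each pair links a phrase the caller may say to the anchor the call flow jumps to, which is the same role the `choice` elements play in the VoiceXML version.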

Testing MS WTE Installation

To test the installation you will need two working phone lines: the first connected to the WTE server's modem to provide access to the IVR application, the second for placing the test call. Plug the first line into the modem, then use the second line to call the first line's number. After about two rings the WTE engine should answer the call and start 'reading' our test page.

The only problems I have ever had with the WTE were traced back to modem problems or SAPI engine installation. If you have trouble with the test application, test the modem and SAPI engine with other applications installed on your computer, or go through Microsoft's troubleshooting guides.

Conclusion

The use of IVR technology will add an exciting new dimension to web application development and promises to add a low cost mechanism for accessing web based resources while away from your desktop computer. I hope this article has shed some light on these technologies and given you some ideas on how you could integrate IVR into your designs. In the next article in this series, we will get a glimpse of the potential of VoiceXML by developing an interactive, voice activated diary tool.

Links

VoiceXML Forum
http://www.voicexml.org/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

W3C VoiceXML
http://www.w3.org/Voice/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

Tellme Login
http://studio.tellme.com/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

VoiceXML Reference
http://studio.tellme.com/voicexmlref/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

IBM Voice Server Beta
http://www.alphaworks.ibm.com/tech/voiceserversdk/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

WTE Info
http://msdn.microsoft.com/library/psdk/webte/wtestartpage_61et.htm?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

WTE Download
ftp://ftp.microsoft.com/developr/platformsdk/july2000/common/redist/wte/?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

SAPI 4.0a Download
http://www.microsoft.com/downloads/release.asp?ReleaseID=26299&WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

WTE Tags List
http://msdn.microsoft.com/library/psdk/webte/wteauthor1_1ktq.htm?WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ

 
 
   
 
 
       
     
     

     
    ASPToday is brought to you by Wrox Press (http://www.asptoday.com/OffSiteRedirect.asp?Advertiser=www.wrox.com/&WROXEMPTOKEN=477200Zf3odrbcTzRHcx5n8QjQ). Please see our terms and conditions and privacy policy.
    Please report any website problems to webmaster@asptoday.com. Copyright © 2002 Wrox Press. All Rights Reserved.