VoiceXML Developer Series: A Tour Through VoiceXML, Part XI

by Jonathan Eisenzopf

In this edition of the VoiceXML tour, we will develop the first three dialogs, which play a greeting, ask for a phone number, and look up the customer's address in an Access database.


Well, we've completed our design work for the pizza ordering application, so it's time to start developing the static and dynamic VoiceXML content. I've chosen to use IIS and ASP to serve up the dynamic content for this application. The ASP language I've chosen is PerlScript, which provides more advanced text-processing features than other ASP languages. These features will come in handy, as you'll see later on. To develop the VoiceXML content, I used Nuance V-Builder to quickly prototype the dialogs based upon our design, and then used Notepad to add the ASP code to the VoiceXML dialog files.

Before we dive into the code, let's revisit our design. We have identified six VoiceXML files and scripts that we need to develop. Each file contains zero or more dialogs. We also have detailed dialog flow diagrams for each, which will help us prototype and develop the VoiceXML content. You should probably take another look at the high-level architecture diagram as a reference. We will be developing the first three files in the diagram: main.vxml, telephone_number.vxml, and validate_phone_number.asp. As you can imagine, the last file will contain our PerlScript ASP code.


main.vxml

This file will be executed first when customers call into the system. In addition to playing the greeting, it will store a number of application-level variables that we will refer to throughout the application. That means that main.vxml will be our application root, which all other VoiceXML files will refer to.

When a customer calls, they will hear, "Thank you for calling Frank's Pizza Palace".

1  <?xml version="1.0"?>
2  <vxml version="1.0">
3    <var name="phone_number" />
4    <var name="address" />
5    <form id="greeting">
6      <block name="play_greeting">
7        <prompt bargein="false">
8          <audio src="../prompts/greeting.wav" />
9        </prompt>
10        <goto next="telephone_number.vxml" />
11      </block>
12    </form>
13  </vxml>

The very first thing we do, on lines 3 and 4, is initialize the phone number and address variables. Other dialogs will set these values. Keeping them at the application level makes the values available to all the other dialogs. Line 8 plays our pre-recorded greeting, and line 10 transitions to the next dialog, telephone_number.vxml.


telephone_number.vxml

This dialog will capture and confirm the customer's phone number and submit it to validate_phone_number.asp for validation. In the original dialog flow diagram, I had split this file into two separate forms: telephone_number and confirm_phone_number. I decided that it would be easier to roll all of the logic into a single form, so I eliminated the confirm_phone_number form. The dialog below also refers to an external grammar on line 5 named PHONE.grammar, which uses the PHONE rule. According to the type attribute, this is a GSL grammar, which means it should work on most Voice ASP platforms. This is the same grammar that was used in a previous example.

view example 2

Next, we play the prompt on line 7, "May I have your phone number please", which has been pre-recorded and is contained in the phone_number.wav file. The resulting utterance will fill the phone_number field. When the field is filled, it executes the assign statement on line 10, which assigns the phone number to an application variable, also named phone_number. This makes the value available to all other dialogs in the application that specify main.vxml as the application root in the application attribute of the <vxml> element (see line 2).

Next, on lines 13 through 19, we play back the number that the ASR system recognized, which is now stored in the application variable phone_number. Users will hear, "I heard xxxxxxxxxx. Is this correct?" where xxxxxxxxxx is the phone number that was recognized.

To catch the user's response to this yes or no question, we transition into a <subdialog> element on line 20, which refers to the yes_or_no.vxml dialog file and names the variable that will hold the returned value confirm. This subdialog is actually used in several places, so I've included it below:

1  <?xml version="1.0"?>
2  <vxml version="1.0">
3    <form id="confirm">
4      <field name="answer" type="boolean?y=1;n=2">
5        <filled mode="any">
6          <return namelist="answer" />
7        </filled>
8      </field>
9    </form>
10  </vxml>

Line 4 of the yes_or_no.vxml dialog above sets the field type to boolean, so the field accepts a spoken "yes" or "no" as well as DTMF 1 for yes and DTMF 2 for no. Once the user has answered yes or no, the value is returned to the calling dialog on line 6.

Now, back in the telephone_number.vxml dialog, when yes_or_no.vxml returns a value, we test whether it's true or false on line 22. If it is true, the customer said yes or pressed 1 on their phone keypad. In either case, we submit the phone number to the validate_phone_number.asp script with the <submit> element on line 23. If the user says no or presses 2, we return them to the top of the dialog on line 25, where they are prompted for their phone number again.
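
The listing for telephone_number.vxml is not reproduced inline here, so as a rough illustration, the flow described above could be sketched as follows. This is a minimal sketch and not the original example 2: its layout does not match the line numbers cited in the text, and the grammar reference syntax, the grammar type string, the prompt path, and the confirm.answer test are assumptions.

```xml
<?xml version="1.0"?>
<!-- Sketch only, not the original listing; the line numbers cited in the text do not apply here. -->
<vxml version="1.0" application="main.vxml">
  <form id="telephone_number">
    <field name="phone_number">
      <!-- Assumption: file#RULE reference and GSL type string; both vary by platform. -->
      <grammar src="PHONE.grammar#PHONE" type="application/x-gsl" />
      <prompt>
        <!-- Assumption: same prompts directory as greeting.wav. -->
        <audio src="../prompts/phone_number.wav" />
      </prompt>
      <filled>
        <!-- Copy the recognized number into the variable declared in main.vxml. -->
        <assign name="application.phone_number" expr="phone_number" />
      </filled>
    </field>
    <block name="read_back">
      <prompt>
        I heard <value expr="application.phone_number" />. Is this correct?
      </prompt>
    </block>
    <!-- The object returned by yes_or_no.vxml is stored in the confirm variable. -->
    <subdialog name="confirm" src="yes_or_no.vxml">
      <filled>
        <if cond="confirm.answer">
          <submit next="validate_phone_number.asp" namelist="phone_number" />
        <else />
          <!-- Re-entering the form resets its items, so the caller is asked again. -->
          <goto next="#telephone_number" />
        </if>
      </filled>
    </subdialog>
  </form>
</vxml>
```

Using <goto> to re-enter the same form is one way to restart the dialog; a <clear> on the affected form items would be another.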


validate_phone_number.asp

This dialog queries the Access database for an address matching the phone number that the user entered and confirmed in the telephone_number.vxml dialog. If we find a matching record, we transition to the confirm_address form; otherwise, we transition to the record_address form, which prompts the user to say their address.

This VoiceXML file is unlike the others we've developed so far because it intermingles VoiceXML with PerlScript ASP code. On line 3, we specify that the ASP language in this script is PerlScript. On lines 7 and 8, we create a new instance of an ADO object and connect to the Access database using the Microsoft Access database driver. On line 11, we retrieve the phone_number variable from the Request object. If you are unfamiliar with ASP, the Request object contains all of the field values that were submitted from a form. In our case, the phone_number field was submitted to this script by the telephone_number.vxml form.

Lines 12 through 22 convert the phone number, which may have been passed as a set of words rather than digits, to numbers. Perl is well known for its regular expression and text processing capabilities, so it's well suited for creating complex VoiceXML applications, which require many different types of text and language processing, from grammars and prompts to parsing input and output values. In fact, on line 36, you'll see another Perl regular expression that searches for a number in the address variable and encloses it in a <sayas> element so that the TTS engine will pronounce the number portion of the address as individual digits rather than as one large number.
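
As an illustration of what that substitution produces, the prompt text for a sample address might end up looking something like the fragment below. This is only a sketch of the generated markup: the class="digits" attribute and the prompt wording around the address are assumptions, since the original listing is not shown here.

```xml
<!-- Hypothetical output after the regular expression has wrapped the street number. -->
<prompt>
  I have your address as,
  <sayas class="digits">555</sayas> green wood drive.
  Is this correct?
</prompt>
```

Without the wrapper, a TTS engine would likely read the street number as a quantity ("five hundred fifty-five"); with it, the caller should hear the digits one at a time.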

Lines 25-27 execute the select statement that searches the database for an address record matching the phone number. Lines 30-39 contain an if/else conditional that basically says: if we have a matching address, create a string that assigns the value to the application.address variable and then transitions the user to the confirm_address form; otherwise, transition to the record_address form. In other words, if we find an address, we want the user to confirm that it is the correct address. If we do not have an address on file, we want the customer to tell us their address so we can save it for next time. Line 40 ends the main block of PerlScript code.
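
The two outcomes of that conditional can be pictured as two alternative fragments of VoiceXML, one of which ends up in the page that the script returns. The fragments below are a sketch under assumptions: the address literal is a placeholder, and the exact markup the script builds is not shown in this article.

```xml
<!-- Both fragments would sit inside a <block>; only one is emitted per request. -->

<!-- Case 1 (address found): store it at application scope, then confirm it. -->
<assign name="application.address" expr="'555 green wood drive'" />
<goto next="#confirm_address" />

<!-- Case 2 (no address on file): go straight to recording one. -->
<goto next="#record_address" />
```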

view example 3

Line 43 prints the value of the $string variable, which controls which form the user will be transitioned to. If the customer has an address, they are transitioned to line 47, where we prompt the user (lines 49-53) to confirm their address, for example, "I have your address as, 555 green wood drive. Is this correct?". We reuse the yes_or_no.vxml subdialog here, just as we did in telephone_number.vxml. If the user confirms their address, we transition to the take_order.vxml dialog, which will prompt the customer for their order. If they do not confirm their address, we transition the user on line 60 to the record_address form starting on line 66.

The user will be sent to the record_address form if they say no when asked if their address is correct, or if they do not have an address on record. Either way, we need an address for the customer. Since we cannot accurately recognize a full address, we have to record it to a WAV file with the <record> element on line 68. The audio content is submitted to the save_address.asp script, where the customer's database record is created or updated and the audio file is saved to disk. An operator will need to go through the database on a regular basis and manually fill in the address field in the Access database based on the recordings.
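
A record_address form along these lines might look like the following. It is a minimal sketch rather than the original listing: the prompt wording, the address_recording field name, the 30-second limit, and the local copy of the phone number are all assumptions made for illustration.

```xml
<form id="record_address">
  <!-- Hypothetical local copy so the script knows whose record to update. -->
  <var name="phone_number" expr="application.phone_number" />
  <record name="address_recording" beep="true" maxtime="30s" type="audio/wav">
    <prompt>Please say your full address after the beep.</prompt>
  </record>
  <block>
    <!-- Recorded audio has to be sent as a multipart POST. -->
    <submit next="save_address.asp" method="post"
            enctype="multipart/form-data"
            namelist="phone_number address_recording" />
  </block>
</form>
```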


Conclusion

Well, we've completed the first three VoiceXML dialog files. In the next article, we will finish the last three dialog files in the application. A few notes on what we've done so far: first, recognition of spoken numbers is not always 100% accurate. If you experience problems, you may want to switch to DTMF tones to capture the number, which is almost always correct the first time. Second, we will need to go back and add event handlers and error checking once we've completed the initial version of the application.

About Jonathan Eisenzopf

Jonathan is a member of the Ferrum Group, LLC, a firm based in Reston, Virginia, that specializes in Voice Web consulting and training. He has also written articles for other online and print publications, including WebReference.com and WDVL.com. Feel free to send an email to eisen@ferrumgroup.com with questions or comments about the VoiceXML Developer series, or for more information about training and consulting services.

This article was originally published on Friday Oct 11th 2002