VoiceXML Developer Series: A Tour Through VoiceXML, Part VI

Sunday Oct 6th 2002 by Jonathan Eisenzopf

In the last edition of the VoiceXML Developer, we created a pizza ordering system for Joe's Pizza Palace that used the GSL grammar format. In this edition, we're going to continue our focus on grammars by examining the other widely used VoiceXML 1.0 grammar format, JSGF.

Overview

JSGF stands for Java Speech Grammar Format and was developed by Sun Microsystems. While the top three voice portal providers, Tellme, BeVocal, and Voxeo, all use the Nuance GSL format, the IBM Voice Server uses JSGF for grammars. As with GSL, a VoiceXML document can either refer to a JSGF grammar in an external file or specify the grammar inline inside the <grammar> element. Both GSL and JSGF are rule-based grammar formats. The basic syntax of a JSGF grammar rule is:

<rule> = token_string;

where the rule name is surrounded by < and > characters, and the tokens representing the input to match appear on the right-hand side of the equal sign. A semicolon ends the rule.

Internal Grammars

You can directly embed JSGF grammars within the <grammar> element. Below is an example of an inline grammar within a VoiceXML document that will match the utterance, "I like pie":

<grammar type="text/jsgf">I like pie</grammar>

As with GSL, when a grammar is included within a form's <field> element, a successful match sets the value of that field.

<?xml version="1.0"?>
<vxml version="1.0">
  <form id="clown">
    <block name="welcome">Hi, I'm commie the clown.</block>
    <field modal="false" name="greeting">
      <grammar type="text/jsgf">hi | hello | howdy | yo</grammar>
    </field>
  </form>
</vxml>
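
To see the value that the grammar sets, the field can be given a <filled> element that reads the field variable back to the caller. The following is a minimal sketch rather than a tested application: the prompt wording is invented for illustration, and it assumes the platform stores the recognized words in the field variable, which may differ between platforms.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="clown">
    <field name="greeting">
      <!-- Hypothetical prompt text, not taken from the article -->
      <prompt>Say hello to the clown.</prompt>
      <grammar type="text/jsgf">hi | hello | howdy | yo</grammar>
      <!-- Runs after the grammar has matched and the field variable is set -->
      <filled>
        <!-- Assumption: the field variable holds the words that were recognized -->
        <prompt>You said <value expr="greeting"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

If the caller says "howdy", the <value> element would be expected to speak that word back, under the assumption stated above.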

File grammars

A JSGF file starts with a JSGF declaration followed by the name of the grammar in the file. The file can contain multiple rules.

#JSGF V1.0;
grammar greeting;
<greeting> = hi | hello | howdy | yo;

The example above contains the same grammar as the previous inline example, except that it is stored in an external file. The modified VoiceXML file, which refers to the external grammar, is below:

<?xml version="1.0"?>
<vxml version="1.0">
  <form id="clown">
    <block name="welcome">Hi, I'm commie the clown.</block>
    <field modal="false" name="greeting">
      <grammar type="application/x-jsgf" src="greeting.jsgf" />
    </field>
  </form>
</vxml>

Using JSGF lists

We can define a list of selections by separating them with the | character.

<grammar type="text/jsgf">small | medium | large</grammar>

In the example above, the grammar will match small, medium, or large. We can also make a word optional by surrounding it with a pair of square brackets:

<grammar type="text/jsgf">small | medium | [real] large</grammar>

The example above will match small, medium, large, or real large. If we were to reuse the Joe's Pizza Palace example from the last article, with the grammar above matching the pizza size, we might want to allow users to provide alternate utterances for each choice. We can do this by grouping the alternatives for a choice inside a pair of parentheses.

<grammar type="text/jsgf">
(small|little)|(medium|regular)|[real](large|big)
</grammar>

Now a customer can say little or small, medium or regular, and large or big or real large or real big.
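
Placed in a complete document, the grouped size grammar might be used as sketched below. This is an illustrative fragment only: the form id, field name, and prompt wording are hypothetical, and the <nomatch> handler is just one way to recover when the caller says something outside the grammar.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="pizzaSize">
    <field name="size">
      <!-- Hypothetical prompt text, not taken from the article -->
      <prompt>What size pizza would you like?</prompt>
      <!-- Parentheses group synonyms; square brackets make "real" optional -->
      <grammar type="text/jsgf">
        (small | little) | (medium | regular) | [real] (large | big)
      </grammar>
      <!-- Played when the caller's utterance does not match the grammar -->
      <nomatch>
        <prompt>Sorry, please say small, medium, or large.</prompt>
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
```

Note that, without further handling, a caller who says "little" and one who says "small" may leave different words in the field, so the application would still need to treat the synonyms as the same size.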

Grammar rules may contain other grammar rules

A grammar can be made up of other grammar rules, which allows us to create complex grammars by building larger rules out of smaller ones. For example, an external grammar file that matches a pizza order is below:

#JSGF V1.0;
grammar phone;
<order> = <pizzaSize> <pizzaType> [pizza] [with] <topping>+;
<pizzaSize> = small | medium | large;
<pizzaType> = [hand] (tossed | stretched | thrown)
            | [deep] (dish | chicago)
            | stuffed [crust];
<topping> = [and] pepperoni
          | [and] olives
          | [and] green peppers
          | [and] mushrooms
          | [and] pineapple
          | [and] anchovies;

The grammar above contains four grammar rules. The <order> rule is composed of three other rules that exist in the same JSGF file: <pizzaSize>, <pizzaType>, and <topping>. The rules as written cover the order itself, beginning at "small"; the lead-in "I would like to order a" is not part of the grammar. A sample utterance containing a matching order is listed below:

I would like to order a small deep dish pizza with olives, 
and pepperoni, and anchovies.

So a JSGF grammar can contain multiple words and phrases represented by subgrammars that, when combined, can match a complex utterance. The grammar above also contains an operator we haven't talked about yet. The last part of the <order> grammar rule uses the <topping> subgrammar followed by a + character. In GSL, we used the same character to match one or more toppings, though it was placed before the subgrammar. Placing a + character after a word, phrase, grouping, or subgrammar, tells the speech recognition engine to look for one or more occurrences of the grammar. In this case, we need to match the list of toppings that the customer would like on their pizza.
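
The + operator can also be applied to a parenthesized group. The sketch below uses it in an inline grammar to collect one or more toppings. The form id, field name, and prompt are hypothetical, the topping list is shortened, and it assumes the platform accepts unary operators in inline grammars the same way it accepts the simpler inline examples earlier in this article.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="toppings">
    <field name="toppingList">
      <!-- Hypothetical prompt text, not taken from the article -->
      <prompt>Which toppings would you like?</prompt>
      <!-- The trailing + asks for one or more repetitions of the whole group;
           [and] lets the caller optionally join toppings with "and" -->
      <grammar type="text/jsgf">
        ([and] (pepperoni | olives | mushrooms | anchovies))+
      </grammar>
    </field>
  </form>
</vxml>
```

Under these assumptions, the grammar should accept "olives", "olives and pepperoni", or "pepperoni and olives and anchovies", since each repetition of the group contributes one topping.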

Below is a simple VoiceXML fragment that uses the external JSGF grammar listed above.

<?xml version="1.0"?>
<vxml version="1.0">
  <form id="pizzaOrder">
    <block>Hello, thank you for calling Joe's Pizza Palace. May I take your order?</block>
    <field name="order">
      <grammar src="pizzaOrder.jsgf" type="application/x-jsgf" />
    </field>
  </form>
</vxml>

Conclusion

The JSGF format is, in my personal opinion, easier to work with for developing simple grammars. However, I prefer GSL, because it is more widely supported and contains more features for building large and complex grammars. That's not to say that you can't do the same with JSGF. IBM's platform is more than capable of performing the same interactions as a platform from Nuance or Speechworks. In the next

About Jonathan Eisenzopf

Jonathan is a member of the Ferrum Group, LLC based in Reston, Virginia that specializes in Voice Web consulting and training. He has also written articles for other online and print publications including WebReference.com and WDVL.com. Feel free to send an email to eisen@ferrumgroup.com regarding questions or comments about the VoiceXML Developer series, or for more information about training and consulting services.
