VoiceXML 2.0 Grammars, Part II

Thursday Nov 7th 2002 by Jonathan Eisenzopf

In Part II of our introduction to VoiceXML 2.0 grammars, we will learn how to use tokens, rules and operators to create grammars that match natural utterances.

Tokens

Grammars match spoken words or touch-tone digits. These words are referred to as tokens. The simplest grammars are token strings composed of one or more words. For example, we might create an inline grammar that matches my first and last name.

ABNF
<grammar>Jonathan Eisenzopf</grammar>
XML
<grammar>
  <token>Jonathan Eisenzopf</token>
</grammar>
<grammar>
  <item>Jonathan Eisenzopf</item>
</grammar>

Inline grammars are embedded within VoiceXML code rather than stored in external files. By default, words that aren't marked with a special grammar symbol (ABNF) or enclosed in a grammar element (XML) are treated as tokens. Therefore, we could have omitted the <token> element in the XML grammar above and it would have performed exactly the same.

In the example above, my full name is a sequence of tokens (Jonathan followed by Eisenzopf). The Automatic Speech Recognition (ASR) engine will only recognize my full name in the specified order.

I've also included an alternative method for encapsulating a sequence of tokens using the <item> element. The end result is identical to using the <token> element; I wanted to show it to you now because we will be using the <item> element later in the tutorial.
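
To show where an inline grammar sits, here is a minimal sketch of a complete VoiceXML 2.0 document. The form and field names (who, caller), the prompt wording, and the rule name fullName are assumptions made for illustration. The tokens are wrapped in a named rule because, as I read the final SRGS recommendation, grammar content is expected to live inside a <rule>; some platforms may also accept the shorter form shown above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="who">
    <!-- "caller" is a hypothetical field name -->
    <field name="caller">
      <prompt>Who is calling?</prompt>
      <!-- Inline grammar: matches only "Jonathan Eisenzopf", in that order -->
      <grammar type="application/srgs+xml" version="1.0"
               mode="voice" root="fullName">
        <rule id="fullName">Jonathan Eisenzopf</rule>
      </grammar>
      <filled>
        <prompt>Hello, <value expr="caller"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```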

Rule References

Grammars often consist of sub-grammars. This allows us to define reusable grammar components, such as a phone number. These sub-grammars are included in other grammars via a rule reference. A rule reference can point to a local grammar, to an external grammar rule contained in another file, or even to a rule on another server on the Internet. For example, we may want to create a sub-grammar that contains all possible first names and include it in a top-level grammar:

ABNF
<grammar>$firstName Eisenzopf</grammar>

XML
<grammar>
<ruleref uri="#firstName"/> Eisenzopf
</grammar>

The grammar above references the local sub-grammar named firstName. The sub-grammar is local because it's contained in the same grammar file. We could also have referenced the sub-grammar in a different file by specifying the full URI of the grammar file:

ABNF
<grammar>$(http://grammars.com/name.gram#firstName) 
    Eisenzopf</grammar>

XML
<grammar>
<ruleref 
    uri="http://grammars.com/name.grxml#firstName"/> 
    Eisenzopf
</grammar>

Grammar Rules

Grammar files consist of one or more grammar rules. Each rule is defined by a unique name. Rule names cannot contain a period, colon, or hyphen character and cannot be named NULL, VOID, or GARBAGE. Rule names are also case sensitive. To continue expanding on the name example above, let's create the rule referenced above as firstName.

ABNF
$firstName = Jonathan;
XML
<rule id="firstName">Jonathan</rule>

The unique rule name in ABNF grammars is defined by the character string to the right of the $ character. This particular rule is very simple in that it will only match my first name, Jonathan.

XML grammars define rules using the <rule> element. The unique rule name is contained in the id attribute. 
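
In a standalone XML grammar file, rules sit inside a <grammar> element that declares the SRGS namespace, the language, and a root rule. The following is a minimal sketch, assuming an English voice grammar whose root is the firstName rule:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="firstName">
  <!-- Rule names are case sensitive: "firstName" is not "firstname" -->
  <rule id="firstName">Jonathan</rule>
</grammar>
```

The root attribute names the rule that is used when the grammar is referenced without naming a specific rule.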

Grammar rule scope

By default, VoiceXML 2.0 grammar rules are private. This means that rules can only be referenced within the same grammar file. If we want a VoiceXML dialog or another grammar to reference a grammar rule, we need to explicitly scope it as public.

ABNF
public $firstName = Jonathan;
XML
<rule id="firstName" scope="public">Jonathan</rule>

To scope a grammar rule as public or private in ABNF grammars, prepend the keyword public or private to the rule definition.

For XML grammars, add the scope attribute to the <rule> element, with a value of either public or private.
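
As a sketch of how scope plays out across files, suppose the firstName rule lives in the external file used earlier, http://grammars.com/name.grxml. The header attributes below are assumptions for an English voice grammar. The rule has to be public for anything outside the file to reach it:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="firstName">
  <!-- public: may be referenced from other grammars and from VoiceXML -->
  <rule id="firstName" scope="public">Jonathan</rule>
</grammar>
```

A VoiceXML field could then point at that rule by URI. This fragment belongs inside a <field> element:

```xml
<grammar type="application/srgs+xml"
         src="http://grammars.com/name.grxml#firstName"/>
```

If the rule were left private, a reference like this from outside the file would be expected to fail.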

One-of (or lists)

So far, our name grammar is not very useful because it only matches my first and last name. What we want is to expand the list of possible names to include additional first and last names.

We can do this by creating a name rule that includes rule references to a list of first names and a list of last names.

ABNF
$name = $firstName $lastName;
$firstName = Jonathan | Jeff;
$lastName = Eisenzopf | Franklin | Smith;
XML
<rule id="name">
   <ruleref uri="#firstName"/>
   <ruleref uri="#lastName"/>
</rule>
<rule id="firstName">
   <one-of>
      <item>Jonathan</item>
      <item>Jeff</item>
   </one-of>
</rule>
<rule id="lastName">
   <one-of>
      <item>Eisenzopf</item>
      <item>Franklin</item>
      <item>Smith</item>
   </one-of>
</rule>

As you can see, the pipe character is the delimiter for alternative utterances in SRGS ABNF grammars. The lastName rule will match one of Eisenzopf, Franklin or Smith.

For XML grammars, the <one-of> element contains one or more <item> elements, each of which holds a string (or token sequence) for one alternative utterance.

The name rule combines the firstName rule and then the lastName rule via rule references to create a full name grammar, which is capable of recognizing any combination of the listed first and last names.

The list of possible utterances that the name grammar could match is as follows:

  • Jonathan Eisenzopf
  • Jonathan Franklin
  • Jonathan Smith
  • Jeff Eisenzopf
  • Jeff Franklin
  • Jeff Smith

Operators

The Speech Recognition Grammar Specification (SRGS) also includes some very useful operators that allow us to create complex word patterns that reflect natural language by defining grammar tokens and rule references as optional and/or repeatable.

Making Tokens Optional

Since callers may respond to a prompt using different words, grammars must be able to define optional words. For example, when asked to say their name, a caller might say any one of the following:

  • "Um, my name is Jonathan"
  • "My name is Jonathan Eisenzopf"
  • "Um, yeah, well, I'm Jonathan Eisenzopf"

We need to be able to identify and capture all of the words that were uttered in addition to the name. Also, notice that the caller might only give us their first name, so the last name might also be optional.

ABNF
$name = [um [yeah well]] ([my name is] | [I'm])
        $firstName [$lastName];
XML
<rule id="name">
   <item repeat="0-1">um
      <item repeat="0-1">yeah well</item>
   </item>
   <one-of>
      <item repeat="0-1">my name is</item>
      <item repeat="0-1">I'm</item>
   </one-of>
   <ruleref uri="#firstName"/>
   <item repeat="0-1">
      <ruleref uri="#lastName"/>
   </item>
</rule>

For ABNF grammars, optional tokens are defined by surrounding them with a set of square brackets. Optional tokens can also be nested. In the grammar above, the first set of outside brackets will optionally match "um" or "um yeah well". Following the first set of optional tokens, we have used parentheses to group an optional list of alternative utterances. The grouping operator (parentheses) is only used in ABNF grammars because grouping in XML grammars is made explicit by the element structure. This grammar phrase will match "my name is", "I'm", or nothing (because both alternatives are optional). Additionally, this grammar will match a first name alone or a first and last name because the $lastName rule reference is surrounded by square brackets.

Unlike ABNF grammars, XML grammars use the repeat attribute of the <item> element to define optional tokens and rule expansions. This is done by setting the value of the repeat attribute to 0-1, which means "zero or one." As you may remember from earlier, the <item> element can be used to encapsulate token sequences. In the example above, we enclose the item containing the token sequence "yeah well" within the item containing the token "um". The repeat attribute in both is set to 0-1, which means that the optional utterance "um" may be followed by the optional utterance "yeah well". Next, we match the optional list of utterances, "my name is" or "I'm", by enclosing them in a <one-of> element. Lastly, we include the firstName rule with <ruleref> and make the lastName rule optional by enclosing the associated <ruleref> in an <item> element whose repeat attribute is set to 0-1.
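
An equivalent way to express the optional list of alternatives, offered here as an alternative sketch rather than the required form, is to wrap the whole <one-of> element in a single optional <item>. It should accept the same utterances while stating the intent ("this entire choice may be skipped") once:

```xml
<rule id="name">
   <item repeat="0-1">um
      <item repeat="0-1">yeah well</item>
   </item>
   <!-- The whole list of alternatives is optional -->
   <item repeat="0-1">
      <one-of>
         <item>my name is</item>
         <item>I'm</item>
      </one-of>
   </item>
   <ruleref uri="#firstName"/>
   <!-- The last name may be omitted -->
   <item repeat="0-1">
      <ruleref uri="#lastName"/>
   </item>
</rule>
```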

Zero or More

We can match zero or more instances of a token by appending the repeat operator after an ABNF sequence or by setting the repeat attribute of the <item> element to 0-.

ABNF
$mood = I am very <0-> happy;
XML
<rule id="mood">I am 
   <item repeat="0-">very</item>
   happy
</rule>

The repeat operator for ABNF grammars is enclosed in angle brackets (<>) and follows the token it applies to. The syntax for XML grammars is very similar to that for defining optional tokens.

The example grammar above will match any of the following utterances:

  • "I am happy"
  • "I am very happy"
  • "I am very very very very very happy"

One or More

This is very similar to zero or more except that the token must occur at least one time in the utterance.

ABNF
$mood = I am very <1-> happy;
XML
<rule id="mood">I am 
   <item repeat="1-">very</item>
   happy
</rule>

So, the example above would match any of the following utterances:

  • "I am very happy"
  • "I am very very very happy"

But it would not match:

  • "I am happy"

Token Ranges and Exact Matches

We can also specify a range of instances or an exact number of instances that a token can occur in an utterance.

ABNF
$eat = Please <1-5> eat your food;
$eat = Please <5> eat your food;
$eat = Please <5-> eat your food;
XML
<rule id="eat"> 
   <item repeat="1-5">Please</item>
   eat your food
</rule>
<rule id="eat"> 
   <item repeat="5">Please</item>
   eat your food
</rule>
<rule id="eat"> 
   <item repeat="5-">Please</item>
   eat your food
</rule>

The first example will match one to five instances of the word please. The second example will match exactly five instances, and the last example will match at least five instances of the word please.
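
The repeat attribute applies to rule references as well as tokens, which is handy for the reusable phone number component mentioned earlier. The sketch below assumes a seven-digit number spoken one digit at a time; the rule names phone and digit, the optional lead-in phrase, and the header attributes are all hypothetical choices for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice" root="phone">
   <rule id="phone" scope="public">
      <!-- Optional lead-in phrase -->
      <item repeat="0-1">my number is</item>
      <!-- Exactly seven spoken digits -->
      <item repeat="7">
         <ruleref uri="#digit"/>
      </item>
   </rule>
   <!-- Private helper rule: a single spoken digit -->
   <rule id="digit">
      <one-of>
         <item>zero</item>
         <item>one</item>
         <item>two</item>
         <item>three</item>
         <item>four</item>
         <item>five</item>
         <item>six</item>
         <item>seven</item>
         <item>eight</item>
         <item>nine</item>
      </one-of>
   </rule>
</grammar>
```

Changing repeat="7" to a range such as repeat="7-10" would accept numbers of varying length.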

Conclusion

If you have absorbed the content of this tutorial, then you will be able to create almost any VoiceXML 2.0 grammar that's required. In the next tutorial, we will learn some of the finer details of SRGS.

About Jonathan Eisenzopf

Jonathan is a Senior Partner of The Ferrum Group, LLC, which provides speech IVR consulting, training, and professional services. Feel free to send an email to eisen@ferrumgroup.com with questions or comments about this or any article.
