Book Review: VoiceXML, Strategies and Techniques for Effective Voice Application Development with VoiceXML 2.0

Monday Sep 23rd 2002 by Jonathan Eisenzopf

VoiceXML, Strategies and Techniques for Effective Voice Application Development with VoiceXML 2.0 is the VoiceXML book I wish I had written. Find out why this recently published work is this writer's favorite VoiceXML title currently in print.

Technical books come in a series of waves. The first wave describes the technology and provides some examples. The second and third waves consist of more refined content and those mega reference editions that purport to cover everything on the topic.

In a way, I think that the authors of this book, Chetan Sharma and Jeff Kunins, have leapfrogged that whole evolution and produced a comprehensive title that includes gems that can only have originated from masters of the craft.

Chapters 1-4 provide a solid overview of the evolution of pervasive computing and its speech processing roots. Reading Chapter 3 on the birth of VoiceXML was a bit nostalgic for me, and it properly gives credit to the many researchers and early developments that made it possible for inferior-brained humans like myself to develop speech applications.

Chapter 5 does a decent job of reviewing the various development environments and primes readers for Chapter 6, which introduces the VoiceXML language. The authors didn't water down this chapter at all. They dove right into the core concepts of the language and provided a generous amount of text to accompany the examples. My only concern with the chapter is that it might be a bit too heavy for readers who aren't already professional developers. That's OK with me; after all, this book is part of the "Professional Developer's Guide Series." I did appreciate that the authors used JavaScript quite extensively in their examples, while other books only mention it and don't really delve into the topic. Even though the chapter was fairly long, I still felt like something was missing. Perhaps not enough was covered in the chapter. This perceived shortcoming is forgivable, however.

The very next chapter is a complete VoiceXML reference. I was a bit perplexed by the placement of the chapter, since reference chapters usually brush up against the appendix and are often an afterthought. The chapter presents each VoiceXML element alphabetically, including its syntax, a full description of what it is and how it's used, a list and description of its attributes, a list of the other elements that it can be contained in or that it can contain (parents and children), and an example of the element being used in the context of a larger body of code. After I read through the reference, I was actually happy with its placement. Developers shouldn't skip over this chapter.

Chapter 8 introduces grammars and speech synthesis tags. The chapter does a good job of presenting the new grammar specification that was introduced with VoiceXML 2.0. SRGS examples are presented in both ABNF and XML forms. Though the chapter will be appreciated by developers who are upgrading their grammar knowledge from VoiceXML 1.0, I wish that the authors had included one or two more examples of grammars working within a VoiceXML application. The brevity of the SSML portion of the chapter is appropriate, since the concepts are easier to grasp and because SSML will not likely be used as extensively as grammars or other VoiceXML elements.
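
To show what I mean by a grammar working within a VoiceXML application, here is a minimal sketch of my own; it is not taken from the book. It uses Python's standard library to build a VoiceXML 2.0 form whose field carries an inline grammar in the XML form. The function name, form id, field name, and word list are all hypothetical, and a real platform may expect additional attributes on the grammar element.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_choice_form(field_name, choices):
    """Return a VoiceXML 2.0 document (as a string) with one field.

    The field holds an inline XML-form grammar that accepts exactly
    one of the words in `choices`. All names here are illustrative.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "order"})
    field = ET.SubElement(form, "field", {"name": field_name})

    # Prompt the caller with the list of accepted words.
    prompt = ET.SubElement(field, "prompt")
    prompt.text = "Please say one of: " + ", ".join(choices) + "."

    # Inline grammar: a single root rule offering the alternatives.
    grammar = ET.SubElement(
        field, "grammar", {"version": "1.0", "root": field_name, "mode": "voice"}
    )
    rule = ET.SubElement(grammar, "rule", {"id": field_name})
    one_of = ET.SubElement(rule, "one-of")
    for word in choices:
        ET.SubElement(one_of, "item").text = word

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_choice_form("drink", ["coffee", "tea", "milk"]))
```

The same rule in the ABNF form would be roughly a one-liner of alternatives separated by vertical bars, which is why having both forms side by side in the chapter is useful.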

Chapter 9, which covers dynamic VoiceXML scripting, is short and to the point, as it should be. The writers assume that the reader already has a solid understanding of Web development. If you're new to developing dynamic Web applications, you may want to read through a book on Web scripting first. There's really only so much you can say about dynamic scripting in a VoiceXML book before it becomes a Web development book, which would be a bit off-topic.
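
For readers who haven't seen dynamic VoiceXML before, the idea is the same as generating HTML on the server: a script fills a VoiceXML page with data at request time. The following is a rough sketch of my own in Python, not an example from the book. The `BALANCES` table, the `caller` query parameter, and the function names are assumptions made purely for illustration.

```python
from urllib.parse import parse_qs
from xml.sax.saxutils import escape

# A VoiceXML 2.0 page with two placeholders filled in per request.
VXML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>Hello {name}. Your balance is {balance} dollars.</prompt>
    </block>
  </form>
</vxml>
"""

# Hypothetical data store standing in for a real database.
BALANCES = {"alice": 120, "bob": 45}


def render_balance_page(caller):
    """Build the VoiceXML page for one caller, escaping untrusted text."""
    balance = BALANCES.get(caller, 0)
    return VXML_TEMPLATE.format(name=escape(caller), balance=balance)


def app(environ, start_response):
    """Minimal WSGI application that serves the generated page."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    caller = query.get("caller", ["guest"])[0]
    body = render_balance_page(caller).encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/voicexml+xml; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


if __name__ == "__main__":
    from wsgiref.simple_server import make_server

    # Serve on localhost:8000; a voice browser would fetch /?caller=alice
    make_server("localhost", 8000, app).serve_forever()
```

Everything beyond this point (sessions, databases, templating frameworks) is ordinary Web development, which is presumably why the authors kept the chapter short.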

Chapters 11, 12, 14, and 15 provide a wealth of information on the design and development process. These chapters are very rich in content and provide a basis for establishing best practices for speech application development. It was obvious to me that the information presented was not invented just to fill pages but comes from extensive personal experience.

In conclusion, this is a very well-rounded book on VoiceXML. I am very happy with the mix of content: summaries of important concepts such as linguistics, speech recognition, and speech synthesis, as well as the in-your-face examples and complete reference. In fact, I liked it so much that I will probably be using it as a standard reference in my company's VoiceXML training course.

About Jonathan Eisenzopf

Jonathan is a member of the Ferrum Group, LLC, which specializes in Voice Web consulting and training.

