Technical books on a new topic tend to arrive in waves. The first wave describes the technology and provides some examples. The second and third waves bring more refined content and the mega-reference volumes that purport to cover everything on the topic.
In a way, I think that the authors of this book, Chetan Sharma and Jeff Kunins, have leapfrogged that whole evolution and produced a comprehensive title that includes gems that could only have come from masters of the craft.
Chapters 1-4 provide a solid overview of the evolution of pervasive computing and its speech processing roots. Reading Chapter 3, on the birth of VoiceXML, was a bit nostalgic for me. The chapter properly gives credit to the many researchers and early developments that made it possible for inferior-brained humans like myself to develop speech applications.
The very next chapter is a complete VoiceXML reference. I was a bit perplexed by its placement at first, since reference chapters usually sit right next to the appendix and are often an afterthought. The chapter presents each VoiceXML element alphabetically and gives, for each one, its syntax, a full description of what it is and how it's used, a list and description of its attributes, a list of the elements that can contain it or that it can contain (parents and children), and an example of the element used in the context of a larger body of code. After I read through the reference, I was actually happy with its placement. Developers shouldn't skip over this chapter.
Chapter 8 introduces grammars and speech synthesis tags. The chapter does a good job of presenting the new grammar specification that was introduced with VoiceXML 2.0. SRGF examples are presented in both ABNF and XML forms. Though the chapter will be appreciated by developers who are upgrading their grammar knowledge from VoiceXML 1.0, I wish that the authors had included one or two more examples of grammars working within a VoiceXML application. The brevity of the SSML portion of the chapter is appropriate, both because the concepts are easier to grasp and because SSML is unlikely to be used as extensively as grammars or other VoiceXML elements.
Chapter 9, which covers dynamic VoiceXML scripting, is short and to the point, as it should be. The writers assume that the reader already has a solid understanding of Web development. If you're new to developing dynamic Web applications, you may want to read through a book on Web scripting first. There's really only so much you can say about dynamic scripting in a VoiceXML book before it becomes a Web development book, which would be a bit off-topic.
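For readers unsure what "dynamic VoiceXML" means in practice, here is a minimal sketch of the idea: a server-side script builds the VoiceXML page at request time instead of serving a static file. It is not taken from the book. The function name `build_menu_vxml`, the form and rule ids, and the sample menu choices are all hypothetical, and a real application would return the string from a Web framework handler rather than print it.

```python
import xml.etree.ElementTree as ET


def build_menu_vxml(choices, field_name="choice"):
    """Return a VoiceXML 2.0 page (as a string) that asks the caller to pick one of `choices`.

    Hypothetical helper for illustration only; the ids and prompt wording are assumptions.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": "http://www.w3.org/2001/vxml"})
    form = ET.SubElement(vxml, "form", {"id": "menu"})
    field = ET.SubElement(form, "field", {"name": field_name})

    # Prompt text is generated from the data, which is the "dynamic" part.
    prompt = ET.SubElement(field, "prompt")
    prompt.text = "Please say one of: " + ", ".join(choices) + "."

    # Inline grammar in the XML form: one rule listing the allowed phrases.
    grammar = ET.SubElement(
        field, "grammar",
        {"version": "1.0", "root": "options", "type": "application/srgs+xml"},
    )
    rule = ET.SubElement(grammar, "rule", {"id": "options"})
    one_of = ET.SubElement(rule, "one-of")
    for phrase in choices:
        ET.SubElement(one_of, "item").text = phrase

    # Read the recognized value back to the caller once the field is filled.
    filled = ET.SubElement(field, "filled")
    confirm = ET.SubElement(filled, "prompt")
    confirm.text = "You said "
    ET.SubElement(confirm, "value", {"expr": field_name})

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # In a real deployment the choices might come from a database query.
    print(build_menu_vxml(["coffee", "tea", "juice"]))
```

Building the page with an XML library rather than string concatenation keeps the output well-formed even when the menu data contains characters such as `&` or `<`.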
Chapters 11, 12, 14, and 15 provide a wealth of information on the design and development process. These chapters are very rich in content and provide a basis for establishing best practices for speech application development. It was obvious to me that the information presented was not just invented to fill pages, but comes from extensive personal experience.
In conclusion, this is a very well-rounded book on VoiceXML. I am very happy with the mix of content: the summaries of important concepts such as linguistics, speech recognition, and speech synthesis, as well as the in-your-face examples and the complete reference. In fact, I liked it so much that I will probably be using it as a standard reference in my company's VoiceXML training course.
About Jonathan Eisenzopf
Jonathan is a member of the Ferrum Group, LLC, which specializes in Voice Web consulting and training. He will be teaching the VoiceXML Bootcamp June 10-13 in Washington, D.C. Feel free to send an email to firstname.lastname@example.org with questions or comments about this or any article, or for more information about training and consulting services.