Microsoft will release a free Beta version of its .NET Speech SDK this week. This is Microsoft's first foray into the growing voice Web market, where speech recognition technology is fused with Web technologies and the telephone.
The .NET Speech SDK will support the SALT XML specification developed by the SALT Forum, a standards group co-founded by Microsoft, instead of VoiceXML, which is being developed by the World Wide Web Consortium (W3C). Though SALT supports much of the same functionality as VoiceXML, it has branched out from a telephone-constrained model into a multi-modal approach in which developers will be able to deploy speech applications to Web browsers, telephones, and mobile devices.
For example, in the future, users of a SALT-enabled PDA application that provides driving directions could enter their starting location by speaking an address, by writing it, or by selecting the location on a map. This type of seamless multi-modal interactivity, built on the same Web programming model, will drive a new wave of applications that are more natural and flexible for users. SALT is not limited to the PDA, however.
For now, this Beta release supports only the desktop version of Internet Explorer and telephone access via a telephony emulator. Support for Pocket Internet Explorer for PDAs is to be released in the near term.
The Beta SDK integrates with the Visual Studio .NET development environment and will include an extension for Microsoft Internet Explorer (IE). This extension will enable the browser to interpret SALT tags and execute voice content inline with other XHTML content. Because the SALT programming model is Web-centric, developers who are familiar with the IE DOM programming model will be able to learn the SALT language quickly.
The SDK will also include an ASP.NET control for integrating speech dialogs with dynamic scripts on an IIS server, a tool for developing Speech Recognition Grammar Format (SRGF) grammars, a prompt editor for recording and editing audio prompts, and a SALT debugger. SRGF is the format used to define the words and phrases that the speech recognition software can recognize. Interestingly enough, SRGF was developed by the W3C and is also being used as the speech recognition format for VoiceXML 2.0.
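To make the idea of a grammar concrete, here is a minimal sketch of how a short list of recognizable phrases might be written in the W3C XML grammar syntax and read back with Python's standard library. The grammar itself, the rule name "city", and the helper phrases_for_rule are hypothetical illustrations; they are not taken from the SDK, and the grammar tool in the SDK may produce something different.

```python
import xml.etree.ElementTree as ET

# Hypothetical grammar in the W3C XML grammar syntax. The rule name and
# the list of cities are illustrative only; they do not come from the SDK.
GRAMMAR_XML = """\
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" root="city">
  <rule id="city">
    <one-of>
      <item>Seattle</item>
      <item>Redmond</item>
      <item>San Francisco</item>
    </one-of>
  </rule>
</grammar>
"""

# Namespace map so the element names can be addressed in queries.
NS = {"g": "http://www.w3.org/2001/06/grammar"}


def phrases_for_rule(grammar_xml, rule_id):
    """Return the alternative phrases listed under the named rule.

    Hypothetical helper: it only handles a rule whose body is a single
    <one-of> list of plain-text <item> elements.
    """
    root = ET.fromstring(grammar_xml)
    for rule in root.findall("g:rule", NS):
        if rule.get("id") == rule_id:
            items = rule.findall("g:one-of/g:item", NS)
            return [item.text.strip() for item in items]
    return []


if __name__ == "__main__":
    # List the phrases a recognizer would be asked to listen for.
    for phrase in phrases_for_rule(GRAMMAR_XML, "city"):
        print(phrase)
```

In this sketch the grammar restricts the recognizer to three city names; a real application would typically use larger rules and reference them from its speech markup.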
Developers will be able to test applications on their desktop via a telephony emulator, speech recognition software from Microsoft, and a text-to-speech engine licensed from SpeechWorks, all of which will be included with the Beta SDK.
To run Web pages that integrate SALT tags, users install the IE SALT extension and the speech recognition software included with the SDK. In the short term, telephone-based applications can be tested on the desktop via the telephony emulator; Microsoft plans to release a .NET Speech server by the end of the year, according to James Mastan, Group Product Manager of the .NET Speech Platform. A SALT extension for the Pocket Internet Explorer browser will allow SALT applications to run on Pocket PC devices and is expected in the near term, although no specific release date has been announced.
About Jonathan Eisenzopf
Jonathan is a member of the Ferrum Group, LLC, which specializes in Voice Web consulting and training. He has also written articles for other online and print publications, including WebReference.com and WDVL.com. Feel free to send an email to firstname.lastname@example.org with questions or comments about the VoiceXML Strategy series, or for more information about training and consulting services.