So far in our series on SALT (Speech Application Language Tags) we have looked at the basic architecture of SALT-based applications and taken a closer look at some of the SALT "elements" and their syntax. Now that we are ready to start developing, we will look at the new Microsoft .NET Speech SDK, which allows for the rapid development of SALT-based multimodal and telephony applications. We will use the newly released Beta 2 as the basis for this discussion.
Introduction to Microsoft .NET
Let's first take an introductory look at the Microsoft .NET initiative as a whole. From a speech application developer's perspective, Microsoft .NET is an environment for developing, deploying and executing web-based and Windows applications as well as XML/SOAP-based web services. The objective of the framework is to arm the developer with a set of tools, a rich set of class libraries, and the flexibility to work in whichever programming language best suits their expertise. As speech application developers, we can focus on the application's business and presentation logic while leaving the "plumbing" work (e.g. connecting to databases, parsing XML, etc.) to the underlying tools and framework.
Microsoft .NET has three major components: the Microsoft .NET Framework, Visual Studio.NET and the .NET Enterprise Servers. The Microsoft .NET Framework is itself composed of a set of components, including:
- .NET Common Language Runtime (CLR): provides an object and type system common across the .NET-compatible languages and is responsible for the groundwork of running code, such as memory allocation, thread and process management, and security.
- .NET Supported Languages: The .NET Framework gives developers the flexibility of using a number of different languages, including the newly developed C# (similar to C++ and Java), C++, Visual Basic.NET and JScript. Apart from these core languages, a number of third-party vendors have developed support for additional languages as well.
- .NET Class Libraries: a rich and comprehensive set of classes that provide a lot of pre-built functionality and are available to all .NET-supported programming languages. These include classes for user interface development, web services, database access, networking, input/output, web application development, XML/XSLT processing, multi-threading, security, etc.
- ASP.NET: as the name suggests, this is the next generation of the popular Active Server Pages (ASP) web development environment for creating dynamic web applications. ASP.NET-based applications can use any .NET-supported language as the scripting language within a page execution model. ASP.NET makes major advances over ASP, including support for web services, server-based web controls and XML-based configuration of web applications.
For years, Microsoft Visual Studio has been the de facto standard for developing Visual Basic, Visual C++ and ASP-based applications. Visual Studio.NET, the next revision of the Visual Studio toolset, builds on that success and supports .NET programming in C#, Visual Basic.NET, Visual C++ and JScript.NET for web and Windows application development. It provides an integrated development and debugging platform for building Windows Forms-based GUI applications, Windows services, reusable components (or building blocks), web-based applications and web services.
The .NET Enterprise Servers include SQL Server, BizTalk Server, Commerce Server, SharePoint Portal Server, Application Center, Content Management Server, Exchange Server, Host Integration Server, Internet Security & Acceleration Server and Mobile Information Server. These provide a base of pre-built, enterprise-class applications and services.
.NET Speech SDK
So where does the .NET Speech SDK fit in with Microsoft .NET? The Beta 2 release of the SDK makes three main additions to the .NET Framework and toolset.
.NET Speech Add-in for Internet Explorer
This add-in allows Internet Explorer (IE) to be used as a viewer for speech applications. Through a parameter in the URL, the same application can be tested and used either in a voice-only environment (simulating the telephony-based application model) or in a multimodal environment (which is closer to how the application would be used in the real world).
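To make the idea of switching modes through the URL concrete, here is a minimal sketch of a helper that builds test URLs for both environments. The article does not name the actual parameter, so the parameter name `mode`, its values `voiceonly` and `multimodal`, and the sample page address are all hypothetical placeholders; substitute whatever the SDK documentation specifies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# ASSUMPTION: the real parameter name and values used by the Speech Add-in
# are not given in this article; these are illustrative placeholders only.
MODE_PARAM = "mode"
MODES = ("voiceonly", "multimodal")


def with_mode(url: str, mode: str) -> str:
    """Return `url` with the (hypothetical) mode parameter set to `mode`."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    parts = urlsplit(url)
    # Keep existing query parameters, dropping any earlier mode setting.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != MODE_PARAM
    ]
    query.append((MODE_PARAM, mode))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    page = "http://localhost/speechapp/default.aspx?lang=en"  # hypothetical page
    for m in MODES:
        print(with_mode(page, m))
```

Generating both URLs from one base address makes it easy to open the same page in each environment while testing.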