Posted Jul 9, 2007
There are many ways a user can respond to the prompt “What would you like to drink?” While some of us might want a triple martini or an intergalactic gargle blaster, let’s suppose that the user only wants a Coke. The developer specifies a grammar containing the words and phrases “Coke,” “Coca-Cola,” and “that fizzy brown drink.” The speech recognition system compares the user’s utterance with each word and phrase in the grammar and chooses the one that matches most closely.
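As a rough illustration of "choose the closest match," here is a minimal Python sketch. It is only an analogy: a real speech engine scores audio against the grammar, whereas this sketch assumes the utterance has already been turned into text and uses string similarity from Python's standard `difflib` module as a stand-in. The function name `closest_phrase` is hypothetical.

```python
import difflib

# Phrases the developer listed in the grammar (from the example above).
GRAMMAR_PHRASES = ["Coke", "Coca-Cola", "that fizzy brown drink"]


def closest_phrase(utterance, phrases):
    """Return the grammar phrase most similar to the utterance text.

    Assumption: string similarity stands in for the acoustic scoring
    that a real speech engine performs.
    """
    # Compare case-insensitively, but return the phrase as written.
    by_lower = {p.lower(): p for p in phrases}
    # cutoff=0.0 forces a choice even for poor matches; a production
    # engine would instead reject utterances that score too low.
    matches = difflib.get_close_matches(
        utterance.lower(), list(by_lower), n=1, cutoff=0.0
    )
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    print(closest_phrase("coka cola", GRAMMAR_PHRASES))
```

Forcing a choice with `cutoff=0.0` mirrors the description above; raising the cutoff would let the sketch reject utterances that resemble nothing in the grammar.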
How does the speech application know that “Coke,” “Coca-Cola,” and “that fizzy brown drink” all mean the same drink? One approach is to have the speech application look up these words in a translation table. A better approach is to embed the translation of each word within the grammar so that when the user speaks either “Coke” or “that fizzy brown drink,” the speech recognition engine will translate the words to “Coca-Cola.”
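The embedded approach can be sketched as follows. The grammar string uses SRGS XML with the SISR string-literal tag format (`semantics/1.0-literals`), in which the text of each `<tag>` becomes the interpretation. The Python around it is a toy stand-in for a speech engine, assuming exact text matching in place of real recognition. The helper names `load_alternatives` and `interpret` are hypothetical and belong to no speech platform.

```python
import xml.etree.ElementTree as ET

# SRGS XML grammar. With tag-format "semantics/1.0-literals", the text
# inside each <tag> is a plain string that becomes the result.
GRAMMAR = """
<grammar xmlns="http://www.w3.org/2001/06/grammar"
         version="1.0" xml:lang="en-US" mode="voice"
         root="drink" tag-format="semantics/1.0-literals">
  <rule id="drink" scope="public">
    <one-of>
      <item>Coke <tag>Coca-Cola</tag></item>
      <item>Coca-Cola <tag>Coca-Cola</tag></item>
      <item>that fizzy brown drink <tag>Coca-Cola</tag></item>
    </one-of>
  </rule>
</grammar>
"""

NS = {"g": "http://www.w3.org/2001/06/grammar"}


def normalize(text):
    """Lowercase and collapse whitespace so phrases compare cleanly."""
    return " ".join(text.lower().split())


def load_alternatives(grammar_xml):
    """Map each spoken phrase in the grammar to the literal in its <tag>."""
    root = ET.fromstring(grammar_xml.strip())
    table = {}
    for item in root.iterfind(".//g:item", NS):
        phrase = normalize(item.text or "")  # text before the <tag>
        tag = item.find("g:tag", NS)
        if phrase and tag is not None:
            table[phrase] = (tag.text or "").strip()
    return table


def interpret(utterance, table):
    """Toy recognizer: exact text match only (an assumption)."""
    return table.get(normalize(utterance))


if __name__ == "__main__":
    alternatives = load_alternatives(GRAMMAR)
    print(interpret("that fizzy brown drink", alternatives))
```

Because the translation lives in the grammar, the application receives the preferred word directly and needs no separate lookup table.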
Just as the World Wide Web Consortium (W3C) made the Speech Recognition Grammar Specification (SRGS) the standard for defining the grammars used by a speech engine, it has specified Semantic Interpretation for Speech Recognition (SISR) as the standard for interpreting the words recognized by the speech engine.
SISR uses the ECMAScript Compact Profile, a strict subset of ECMAScript designed to meet the needs of resource-constrained environments. The profile restricts ECMAScript features that require large amounts of system memory and processing power. As a result, it fits snugly within grammar rules, where it extracts semantic information from the words recognized by the speech engine.
In addition to translating word aliases to the preferred word, as in the Coca-Cola example above, developers also specify the following tasks with SISR:
Most vendors have adopted VoiceXML 2.0, making it possible to port applications to competing speech platforms. Now that SISR is a W3C standard, vendors should support SISR in addition to their own proprietary languages. The VoiceXML Forum plans to update its VoiceXML certification program to include SISR testing. Make sure that your speech vendor supports SISR so that your grammars and applications can be ported between platforms more easily.
Jim Larson is an independent consultant and VoiceXML trainer. He is the author of The VXMLGuide [www.vxmlguide.com]. He can be reached at firstname.lastname@example.org.