Spoken Search: connotative descriptions vs denotative keywords
April 13, 2006
Automatic Speech Recognition (ASR) may be good enough in controlled acoustic environments, such as your home, but it is not robust enough given the quality of the hardware and the mobile, ever-changing environments in which you may want to make a spoken search. In other words, baseline recognition quality would be poor.
On top of that, a bigger issue is that current search engines are optimized for keyword search, and those keywords can be very specific and very difficult to recognize, even for an ASR system with a large vocabulary in an ideal environment.
The merit of ‘asking’ rather than ‘typing’ a query is that one can be wordy; the drawback is that we are limited in the choice of words we can speak and reasonably expect to be recognized.
The lure of Spoken Search (and, more generally, of spoken interaction with computers) will drive R&D in speech recognition to a point where it may be usable, but the real challenge for Spoken Search is to develop new search engines and methods based on connotative descriptions, as opposed to the current denotative keyword approach. For example:
User: “I need the recipe of a French dish of boiled meats and vegetables”
System: searching for “Pot-au-Feu”
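
The exchange above can be sketched as a toy lookup that maps a wordy description to the specific keyword it describes. This is only an illustration under loud assumptions: the catalogue, the stopword list and the function names are all hypothetical, and plain word overlap stands in for whatever a real description-based search engine would do.

```python
import re

# Hypothetical mini-catalogue: each specific keyword (the denotative term)
# is paired with a short description. Illustrative data, not a real index.
CATALOGUE = {
    "Pot-au-Feu": "french dish of boiled meats and vegetables",
    "Ratatouille": "french dish of stewed vegetables",
    "Paella": "spanish rice dish with seafood or meat",
    "Goulash": "hungarian stew of meat and paprika",
}

# Common words that carry little meaning for matching (assumed list).
STOPWORDS = {"i", "need", "the", "a", "an", "of", "and", "or", "with", "for", "recipe"}


def content_words(text):
    """Lowercase the text and keep only the words that are not stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def resolve_keyword(description):
    """Return the catalogue keyword whose description shares the most
    content words with the spoken description, or None if nothing matches."""
    query = content_words(description)
    best_keyword, best_score = None, 0
    for keyword, text in CATALOGUE.items():
        score = len(query & content_words(text))
        if score > best_score:
            best_keyword, best_score = keyword, score
    return best_keyword
```

Calling `resolve_keyword` with the user's sentence from the example picks “Pot-au-Feu” out of this small catalogue. The point of the sketch is that every word the user has to say is an everyday one that a recognizer has a fair chance of getting right, while the hard-to-recognize keyword is produced by the system rather than spoken.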