Thursday, December 01, 2011
People have been working on speech recognition for a long time. My first exposure was a demonstration of Shoebox, a calculator with speech input, at the IBM pavilion at the 1964 World's Fair. In spite of years of research and hacking, speech recognition has remained a niche technology.
Have we finally seen the start of practical, ubiquitous speech recognition with Apple's Siri? Maybe.
Siri has a lot of infrastructure support that earlier speech recognition systems lacked. It sends the speech back to a server for recognition, and that server has assimilated clues from massive amounts of data on speech patterns. Once the speech is recognized, Siri relies on other services for search and for answers to questions. If you ask how far it is from Los Angeles to New York, it will go to WolframAlpha for the answer. Ask it where to get Indian food in your neighborhood and it will go to Yelp. (What will Apple do if you ask where to find bomb-making instructions or dirty pictures?)
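To make that hand-off concrete, here is a toy sketch of the "recognize, then route to a service" idea. It is purely illustrative: Apple has not published how Siri picks a service, so the keyword rules, the `ROUTES` table, and the `route_query` function are all my own assumptions, not anything Siri actually does.

```python
# Hypothetical sketch: pick a backend service for already-recognized text.
# The keyword rules below are illustrative assumptions, not Siri's real logic.

ROUTES = [
    # (keywords that trigger the route, name of the service to hand off to)
    (("how far", "distance"), "WolframAlpha"),
    (("where to get", "restaurant", "food"), "Yelp"),
]


def route_query(text: str) -> str:
    """Return the name of the service that would handle the recognized text."""
    lowered = text.lower()
    for keywords, service in ROUTES:
        if any(keyword in lowered for keyword in keywords):
            return service
    return "web search"  # fallback when no rule matches


if __name__ == "__main__":
    for query in (
        "How far is it from Los Angeles to New York?",
        "Where to get Indian food in my neighborhood?",
    ):
        print(query, "->", route_query(query))
```

A real system would need far more than keyword matching to parse the request, which is probably where much of the hard work lies.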
Google seems to have the recognition part down, but may be playing catch-up with input parsing and answer retrieval.
In spite of Apple's secrecy, Siri has attracted a hobbyist following. Check out this video of a hobbyist using Siri to control lights and other things in a room.
The developer of that app had to jump through hoops using SiriProxy to get it to work. Here's hoping Apple provides tools to encourage this sort of thing -- that might be what it takes to finally get speech recognition off the ground.