I just installed the updated version of Google's Mobile App for iPhone because I wanted to test out the voice recognition search. My experience with speech recognition (excluding systems that recognize only a small subset of English, such as digits) has been spotty at best. I know Google has been collecting data through their 411 service for a while, though, so perhaps they can do better.
I had no faith that the first query I chose would work. Google surprised me when it properly interpreted the query: [can has cheezburger]. Not only did it get the first two words correct, it also used the correct spelling for "cheezburger", even though that word exists in no dictionary. If I just say "cheeseburger", it figures out the correct spelling for that too.
This is one of the reasons I'm so excited about Computer Science. A product like this couldn't have been built even a few years ago because of the vast amount of data required to power it (a web corpus, search logs, 411 voice samples). These days, data is arguably more important than algorithms: by throwing enormous amounts of data at the problem, a comparatively simple algorithm can outperform a far more sophisticated one trained on less. This has only recently become possible thanks to the affordability of processing terabytes and petabytes of data. I recommend Programming Collective Intelligence for a quick introduction to some of the algorithms enabled by massive amounts of information.
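To make the "data beats cleverness" point concrete, here's a toy spelling suggester in the spirit of Peter Norvig's well-known corrector. The "algorithm" is trivial (generate candidate edits, pick the most frequent one seen before); all of the smarts come from the query log it's trained on. The tiny corpus below is obviously a hypothetical stand-in for the billions of logged queries a real system would mine.

```python
from collections import Counter

# Hypothetical toy query log; a real system would use billions of queries.
QUERY_LOG = "can has cheezburger i can has cheezburger".split()
COUNTS = Counter(QUERY_LOG)
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one delete, transpose, replace, or insert away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes    = [a + b[1:]               for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces   = [a + c + b[1:]           for a, b in splits if b for c in LETTERS]
    inserts    = [a + c + b               for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)

def known(words):
    """Keep only candidates that actually appear in the query log."""
    return {w for w in words if w in COUNTS}

def correct(word):
    """Return the most frequent logged query within two edits of `word`."""
    candidates = (known([word])
                  or known(edits1(word))
                  or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
                  or [word])
    return max(candidates, key=lambda w: COUNTS[w])
```

With enough real queries in `COUNTS`, `correct("cheeseburger")` lands on "cheezburger" not because any dictionary sanctions it, but because that's what people actually type.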
So what's next? I think Google should hook up their excellent voice recognition system with their best-of-breed machine translation software and create a universal translator. Imagine an iPhone app where you select an output language, speak into the phone, and get back a text or voice translation in the target language. It's still a pipe dream for now. For one, they'd need to port their voice recognition system to other languages, or it'd be a one-way conversation. Actually, "port" is probably the wrong word; "train" would be more accurate. Algorithms built from data, not code, are language-independent from the get-go: instead of writing a whole new algorithm for each language, the system would just need to be trained on an additional corpus, with perhaps a few tweaks to improve precision and recall. To make a truly Star Trek-worthy translator, they'd also have to work on reproducing human speech. Considering how far we've come in the past few years, I don't think a universal translator would take more than 2-4 years to come to fruition.
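The "train, don't port" idea is easy to illustrate. The sketch below (my own toy example, not anything resembling Google's actual pipeline) trains a bigram word model: notice that the training code contains zero language-specific logic, so moving to a new language means swapping the corpus, not rewriting the algorithm.

```python
from collections import Counter, defaultdict

def train_bigram_model(tokens):
    """Count which word follows which. No language-specific logic here,
    so 'porting' to another language is just supplying a new corpus."""
    model = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model

def most_likely_next(model, word):
    """Predict the most frequent follower of `word`, if any."""
    followers = model.get(word)
    return followers.most_common(1)[0][0] if followers else None

# The same code, two languages -- only the training data changes.
english = train_bigram_model("i can has a cheezburger i can has fun".split())
spanish = train_bigram_model("yo quiero una hamburguesa yo quiero agua".split())
```

Here `most_likely_next(english, "can")` predicts "has" and `most_likely_next(spanish, "yo")` predicts "quiero", from identical code. A production system would layer on acoustic models and far better smoothing, but the division of labor is the same: one algorithm, per-language data.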