I'm a big fan of combining Direct Voice Input and multi-touch into a single device. In fact, voice can be used to complement any kind of NUI technology, be it touch or gesture based. One of my favorite voice applications is Google Voice, an iPhone application from Google that allows you to do Google searches using voice commands. For example, the other day my son asked me, "How big is Earth?" I immediately took out my iPhone and asked Google Voice, "How big is the Earth?" The answer came back in 2-3 seconds in the form of a Google results page. The information I needed was in the top link, a page from factrain.com, and it took only another 2-3 seconds to open it directly from the Google Voice application. Factrain.com provided me with the circumference, surface area, and volume of the Earth. Total time to come up with an answer to a pretty good question: maybe 10 seconds, tops. That's power.
I also enjoy using the iPhone's voice dialing feature, which rarely fails me and has probably saved my life more than once while driving. I can ask the phone to dial a specific number or "Call Home" without looking at my iPhone even once. It's wonderful.
Last night I gave a talk on NUI to a local programmers' user group. After the presentation I received a number of questions, all of which were great, but the one that stuck out in my mind was, "Can you use a Voice User Interface with a desktop?" My answer at the time was that there was nothing technically preventing it and that in fact it was already possible with Windows or the Mac. The problem was one of context. I work from my home office with the door shut, so using voice to control my computer would be advantageous when combined with keyboard, mouse, and possibly touch. But what about the majority of people who work in open work spaces such as cubicle farms - you know, those huge rooms with a hundred cubicles and very little privacy? In that setting, using voice to help navigate your computer could be embarrassing. "Computer, play Britney Spears' Circus Breathe."
ASUS recently revealed their EeeTop ET2002 to Netbooknews.com. The EeeTop ET2002 is an all-in-one desktop computer with a touch screen and some voice-controlled applications. To say it is disappointing is an understatement. From the looks of it (see video below), you have to touch a graphic on the screen to activate the voice interface before every command. That's redundant because each voice command starts with a verb (e.g. "Play album Circus Breathe"), so the verb itself could tell the computer that a command is coming. Maybe ASUS will get this right before shipping, but for now it's still just a toy, which is too bad because voice can be a wonderful NUI input method.
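To make the redundancy concrete, here is a minimal sketch in Python of how a leading verb could itself act as the activation signal. It is only an illustration under my own assumptions: the verb list and the `parse_command` function are hypothetical, and it says nothing about how the EeeTop software actually works. It also assumes the speech has already been turned into text by some recognizer.

```python
# Hypothetical set of verbs that can open a voice command (an assumption for illustration).
COMMAND_VERBS = {"play", "open", "call", "stop"}


def parse_command(transcript):
    """Return (verb, argument) if the transcript starts with a known verb, else None."""
    words = transcript.strip().split()
    if not words:
        return None
    verb = words[0].lower()
    if verb not in COMMAND_VERBS:
        # Ordinary speech: ignore it instead of requiring a touch to arm the recognizer.
        return None
    return verb, " ".join(words[1:])


if __name__ == "__main__":
    print(parse_command("Play album Circus Breathe"))
    print(parse_command("What a nice day"))
```

A real system would still need some way to avoid reacting to conversation that happens to begin with one of those verbs, which is exactly the kind of design thought this sort of product needs.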
The point of all this is that Direct Voice Input makes sense, but it depends on the context and the way it's implemented. Apple and Google have done an excellent job with the iPhone voice-activated dialing and the Google Voice application; ASUS has done a horrible job with its EeeTop ET2002. The use of voice should not be treated as a novelty, but as a serious interface in specific contexts. A little thought given to the design could have made the EeeTop ET2002 really useful to people who have the luxury of working in private.
This brings me to another point about the use of Direct Voice Input: it depends on social conventions. For example, I can use voice dialing on my iPhone in a public setting without causing any rubbernecking because the device is a mobile phone. It's common to see people talking into their phones even in the most crowded situations. However, if I opened my laptop and started talking to it in the waiting area of an airport, people would think I was either insane, stupid, or showing off. Again, it's a matter of context. Where are you? Who is near you? What are you doing?
Voice is going to be an important part of the NUI paradigm shift, but as is the case with any NUI technology, the context in which the interface is used determines, in large part, its usefulness.
4 comments:
It may be because I'm French, but voice recognition never works on my iPhone...
Voice input doesn't work at all if it's not tuned to the locale of the user.
I have a friend in the USA who is English, and the iPhone voice input doesn't work for him either.
Of course you have the same problem with other forms of NUI input such as gestures which change across cultures. For example, when I lived in Germany 20+ years ago I learned that the OK gesture as given in the U.S. means something totally different to Germans.
Indeed, it's a problem that you have with other input systems. But I think it's a major problem for voice, since each region has its own accent...
User testing will become really hard... Or recognition systems will have to be far more robust.
Is there a problem for people from California vs. people from Texas, for example?
I did some Googling and I found a lot of articles complaining about the Google app (i.e. Google Voice) not working with British accents, but not much on the iPhone voice-activated dialing, which leads me to believe that the accuracy depends in large part on the implementation (Duh! I know). Perhaps AT&T's speech recognition is better than Google's?
As far as accents go, I could not find anyone complaining about American accents. I'm from Minnesota and people say I have a strong "Fargo"-like accent, yet both applications work fine for me.
It's worth noting that Google or AT&T could update their speech recognition software to support specific accents or locales without impacting your phone at all. This is a nice advantage of having these services be remote rather than resident on your phone. Although, even if the recognition were on the phone, Apple could fix it with an iPhone software update.
I think the first accents to be addressed by Google will probably be English accents (British and Australian). Hopefully they will get to non-English languages (e.g. French) fairly quickly.