Friday, August 28, 2009

The Next Killer App: Augmented Reality

Augmented Reality, according to Wikipedia, is:

a term for a live direct or indirect view of a real-world environment whose elements are supplemented with-, or augmented by computer-generated imagery. The augmentation is conventionally in real-time and in meaningful context with environmental elements.

A lot of the research into AR has focused on a form factor that projects annotations onto a lens (e.g. transparent shield or video feed) between the user and the actual objects. Examples include heads-up displays used by fighter pilots, visors or goggles, and video devices such as mobile phones or slates. In this case the user is given additional information which is projected or otherwise rendered on a lens of some kind. Look through the lens and you see the additional information.

Another form factor is to project the augmented data onto the object you are looking at. A couple of good examples come from the Sixth Sense demonstration at TED this year, where a keypad was projected on the user's hand and metadata about a student was projected on their shirt.

A third form factor is to superimpose information onto objects in a video display. Examples of this include the 3D baseball cards and the First Down line shown while watching American football. The difference between augmented video and an augmented lens is that you don't look through the augmented video as you do with a lens; you look at a computer monitor or television.

There are also various types of data that can be displayed. Some data, like that in a heads-up display, is monitoring information (e.g. speed, altitude, etc.) rather than supplemental information about the objects in the field of view.

F-16 Fighter Jet VS. Bird


Another type of information is based on location, for example the location of the nearest subway entrance or the locations where people left messages or tweets.
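To make the location-based case concrete, here is a minimal sketch in Python of how an augmented lens might decide where to draw a label using only GPS position and compass heading. This is my own illustration, not code from any of the applications mentioned; the function names, the 60-degree field of view, and the 320-pixel screen width are all assumptions chosen for the example.

```python
import math

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from true north (0 <= bearing < 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def screen_x(heading_deg, target_bearing_deg, fov_deg=60.0, screen_width=320):
    """Horizontal pixel position for a label, or None when the target
    lies outside the camera's field of view.
    fov_deg and screen_width are hypothetical values, not device specs."""
    # Signed angle from the direction the phone faces to the target,
    # wrapped into the range [-180, 180).
    offset = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    if abs(offset) > fov_deg / 2:
        return None
    # Map -fov/2 .. +fov/2 onto 0 .. screen_width.
    return round((offset / fov_deg + 0.5) * screen_width)
```

The phone's GPS fix and the point of interest go into bearing_deg, and the result is compared against the compass heading in screen_x. A target straight ahead lands in the middle of the screen, and one behind the user is simply not drawn. A real application would also need distance, tilt, and sensor smoothing, none of which is attempted here.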



The third, and perhaps the most difficult to implement, is object recognition and annotation: the ability to recognize a face, or some type of object, and display associated data. This requires that the application be able to recognize faces or objects rather than relying on geo-location and compass direction data, as is the case with location-based data.



These are all examples of different form factors and data that can be displayed in an augmented reality application. They are not mutually exclusive, but the ones that you'll be seeing the most are the augmented video and the augmented lens. Specifically there are a number of examples of mobile phone applications which provide an augmented lens.

ReadWriteWeb.com provides a pretty good rundown of different AR implementations in this article. I've also covered the topic before (here, here, here, and here).

The AR application I want to show today is called Nearest Tube, by a company named Acrossair, for iPhone OS 3.1 on the iPhone 3GS. Here is an interview with the designers that is pretty illuminating from a developer's perspective.

The biggest problem with the mobile AR applications that use the lens form factor is the way in which information is presented. As we see more and more of these applications introduced, especially for the iPhone, the way in which annotations on the screen are shown will undoubtedly improve.

Wednesday, August 26, 2009

Move Over Jonathan Ive, Here Comes Jon Doe

One of the things I've come to understand about multi-touch and NUI in general is that design of these types of interfaces is going to require that we leverage what we know from GUI but also that we introduce really new ways of doing things.

Recently while trolling the Internet I discovered an article by Cult of Mac which pointed me to the work of an anonymous designer who has thought long and hard about how to design applications for the rumored Apple tablet computer. I've seen a number of mockups and pretend demos of the Apple tablet but this guy's work is the most detailed and the best I've seen. It's in an entirely different league.

According to Cult of Mac he has remained anonymous so that he doesn't damage his chances of being hired by Apple after he graduates from college or grad school. Either he is being sincere in his wish for anonymity, or he is an extremely clever self-promoter. Either way his anonymity is intriguing.

Below is the first in what is currently a series of 10 videos which brilliantly explore his user interface design for the rumored Apple tablet. His work is explained in detail, and his own doubts are exposed with sincerity in video #10. I don't know who this Jon Doe is, but I like his designs and the way he thinks through design problems. If Apple doesn't hire him, it will be a huge opportunity cost for them. After all, he hasn't even graduated yet and his work is superb. If Apple does hire him, then all I can say is "Move over Jonathan Ive, here comes Jon Doe".

Update August 28

Ron George and I exchanged some comments on this article where he criticized the design of Jon Doe and I asked him to provide more details. To my delight, Ron did exactly that! On his own blog Ron provided a brief assessment of the 1st and 10th videos by Jon Doe as well as some ideas for improvement. I'm guessing that it took Ron some time to do this and I want to express my gratitude as I learned a lot by reading the post.

After taking a close look at Ron's constructive criticism and comparing it to the video, I had to agree that some of the gestures just don't make sense. That doesn't mean that Jon Doe hasn't done a great job. He obviously worked hard and the mock-ups are fun to watch. I think Jon Doe has a place at the Apple design table and will one day be a great designer, but Jonathan won't have to give up his seat just yet. ;-)

Here is Ron George's post.
http://rongeorge.com/design/interaction-design/the-fake-apple-gestural-movies-a-critique-by-request/


Direct Voice Input and NUI

I'm a big fan of combining Direct Voice Input and multi-touch into a single device. In fact, voice can be used to complement any kind of NUI technology be it touch or gesture based.

One of my favorite voice applications is Google Voice, an iPhone application from Google that allows you to do Google searches using voice commands. For example, the other day my son asked me, "How big is Earth?" I immediately took out my iPhone and asked Google Voice, "How big is the Earth?" The answer came back in 2-3 seconds in the form of a Google results page. The information I needed was in the top link - a page from factrain.com - and it took only 2-3 seconds to access it directly from the Google Voice application. Factrain.com provided me with the circumference, surface area, and volume of the Earth. Total time to come up with an answer to a pretty good question: maybe 10 seconds tops. That's power.

I also enjoy using the iPhone's voice dialing feature, which rarely fails me and has probably saved my life more than once while driving. I can ask the phone to dial a specific number or "Call Home" without looking at my iPhone even once. It's wonderful.

Last night I gave a talk on NUI to a local programmers' user group. After the presentation I had a number of questions, all of which were great, but the one that stuck out in my mind was, "Can you use a Voice User Interface with a desktop?" My answer at the time was that there was nothing technically preventing it and that in fact it was already possible with Windows or the Mac. The problem was one of context. I work from my home office with the door shut and so using voice to control my computer would be advantageous when combined with keyboard, mouse and possibly touch. But what about the majority of people who work in open work spaces such as cubicle farms - you know, those huge rooms with a hundred cubicles and very little privacy? In that setting using voice to help navigate your computer could be embarrassing. "Computer, play Britney Spears' Circus Breathe."

ASUS recently revealed their EeeTop ET2002 to Netbooknews.com. The EeeTop ET2002 is an all-in-one desktop computer with a touch screen and some voice controlled applications. To say it is disappointing is an understatement. From the looks of it (see video below) you have to touch a graphic on the screen to activate the voice interface before every command. That's redundant because each voice command starts with a verb (e.g. "Play album Circus Breathe"). Maybe ASUS will get this right before shipping, but for now it's still just a toy, which is too bad because voice can be a wonderful NUI input method.

The point of all this is that Direct Voice Input makes sense, but it depends on the context and the way it's implemented. Apple and Google have done an excellent job with the iPhone voice activated dialing and the Google Voice application; ASUS has done a horrible job with its EeeTop ET2002. The use of voice should not be treated as a novelty, but as a serious interface in specific contexts. A little thought to the design could have made the EeeTop ET2002 really useful to people who have the luxury of working in private.

This brings me to another point about the use of Direct Voice Input. It depends on social conventions. For example, I can use voice dialing on my iPhone in a public setting without causing rubbernecking because the device is a mobile phone. It's common to see people talking into their phones even in the most crowded situations. However, if I opened my laptop and started talking to it in the waiting area of an airport, people would think I was either insane, stupid, or showing off. Again, it's a matter of context. Where are you? Who is near you? What are you doing?

Voice is going to be an important part of the NUI paradigm shift, but as is the case with any NUI technology, the context in which the interface is used determines, in large part, its usefulness.

Tuesday, August 25, 2009

Sony's New Touchscreen eReader

Just this morning Sony showed off its Kindle 2 killer, the Daily Edition eReader, which includes wireless connectivity, 17 shades of gray, and a touch screen.

It's larger than the Kindle 2 and, in my opinion, much nicer looking. It also has a touchscreen (Amazon.com, are you paying attention?). As reported by Engadget, the device will be available for sale this December.

MT-50 Table: Big, Rugged, High-Resolution

Ideum, which I covered back in February, has released a new version of their 50" multi-touch table, the MT-50 Table, for non-profit organizations such as museums. Ideum hardened the device, claiming the large multitouch table is even more rugged than a Microsoft Surface. That's an impressive claim.

I've owned a Microsoft Surface for several months and it can take serious abuse. I have four children, the oldest of whom is 7, and they have climbed and stood and even stomped on the device many times. I have dropped my keys and other objects on it without a scratch, and while I treated it with reverence for the first couple of months after I bought it, it's now just another piece of furniture in my office - albeit a really cool, interactive piece of furniture.

The ruggedness as well as the Microsoft brand has been one of the Surface's biggest selling points compared to other multi-touch tables. Ideum is the first vendor to claim their device is even more rugged.

With a price tag of $21,000.00 the device is a bit more than the Surface, which when purchased with developer seats, installation, and other fees comes out to $18,000.00. A non-developer version of Surface would probably run about $15,500.00. But the Ideum MT-50 is larger and higher resolution and allows developers to create applications in Adobe Flash and other languages (e.g. Python and Java). Those are some significant advantages and would seem to justify the higher price tag in my opinion. That said, I've never developed applications for this device or even seen it in person so there is a lot about it that is still a mystery to me.

Monday, August 24, 2009

Multitouch Support in Firefox - A Work in Progress

Felipe Gomes, a summer intern at Mozilla, has been working on multi-touch support in the Firefox browser in anticipation, I suspect, of the Windows 7 release this fall.

The video shows support for pinching and multiple fingers and looks pretty good. I wonder how Mozilla will handle support for multi-touch in JavaScript? Apple Safari, as reported in a past blog post, already has support for multi-touch with a JavaScript extension. In addition, Silverlight 3 supports multi-touch and I assume Adobe Flash will too (eventually).

It would be nice if all the browser vendors supported the same JavaScript constructs for multi-touch; otherwise web developers will have to write different multi-touch Ajax applications for different browsers.




Friday, August 21, 2009

Sketching User Experiences, The Book

I just finished reading Sketching User Experiences by Bill Buxton and found it to be very enlightening and an excellent read. I give it my highest recommendation for people interested in design.

The book is about the art of sketching or creating very rough, throwaway models - on paper or in 3D - of devices and interfaces early in the design process. The book is also about "Holistic Design" by which Mr. Buxton means design that is a product of business, user experience, and technical considerations.

One of the themes early in the book, which really resonated with me personally, was the idea that computer software and device design is in a state of transition. Traditionally, the design of computer devices has been treated as a separate consideration from the design of the software that runs on them - this is true even for the Apple Macintosh, as pointed out in Buxton's book. However, as the devices and their software become more intimately connected (e.g. the iPhone and Microsoft Surface), this separation of concerns becomes less desirable.

In the future a clear separation between hardware and software will disappear so that it will be perceptually and technically impossible to determine exactly where industrial design ends and human-computer interface design begins.

This, in my opinion, is one of several concepts at the heart of Natural User Interfaces: the fact that the device and the software are indistinguishable. A user experience cannot, in my opinion, be NUI if the device and the software are easily distinguishable, as is the case with personal computers today.

NUI is about making the user experience as natural as possible in terms of input and output when interacting with devices and software. If the user perceives a clear separation between the software and hardware, then the natural effect is lost - they are too cognizant of the technology.

Friday, August 14, 2009

Apple iSlate for Education and Infotainment

Gizmodo has a new "insider" story on the rumored Apple tablet claiming that the device will be issued in two editions, one for the education market and one for ... well .... the rest of us.

Let's assume for a minute that this is true; after all, Gizmodo is a pretty reliable news source. If Apple is marketing the new device for Education in addition to Infotainment, then I'm leaning toward an announcement in early September with a ship date during the Holidays of this year.

We can only guess at the name, but I just don't think "tablet" is all that likely to be a part of it because the term "Tablet PC" was coined by Bill Gates and I just can't see Jobs buying into anything coined by Microsoft, let alone Bill Gates. So I'm leaning toward iSlate as the name of the device. It's cool. The term "slate" better represents the form factor of the rumored device. Also, the name harkens back to the early days of education where every kid used a "slateboard" to practice their letters and numbers.

But I've made other predictions on this device, which I really want Apple to release, and so take this one with a huge block of salt.