
Joshua Blake sent me an email pointing to a blog post he wrote about a new acronym, OCGM, that he and Ron George created to describe common aspects of human-computer interactions in Natural User Interfaces. That's a mouthful, so I'll try to simplify it.
The technology you are using to read this post right now is most likely some type of Graphical User Interface (GUI). It might be Windows XP/Vista/7, or Mac OS X, or an iPhone or whatever. The point is that the primary means by which humans interface with computers today is through GUIs. GUIs have certain elements in common; most notably, all GUIs tend to use Windows, Icons, Menus, and Pointing devices (WIMP). The WIMP acronym has been in wide use for at least a decade to describe the primary human-computer interface components in GUI systems. As we attempt to find a clear definition of Natural User Interfaces - none have hit the mark as far as I'm concerned - it's natural to want to find parallels between the language used to describe NUIs and the language we have traditionally used to describe GUIs.
WIMP is the prevailing set of metaphors used to design GUI systems. What is the prevailing set of metaphors used to describe NUIs? In order to find an acronym I think you need to have a large number of successful NUI implementations to distill, and in truth we really don't have that yet. Therefore any attempt to come up with NUI's equivalent to WIMP is probably going to fail in the long run. Just for fun, however, I decided to take a closer look at three attempts to find NUI's "WIMP": PETA, Post-WIMP, and OCGM. But before I examine those attempts let me explain my two-part litmus test for considering them.
The first thing I'm looking for in a WIMP-like acronym for NUI is a term that is general enough to include - at the very least - three of seven technologies that I feel fall under the NUI umbrella as outlined in a slide presentation I gave to the International Association of Software Architects earlier this year.
NUI includes the following technologies (and probably others):
- Touch UI
- Voice UI
- Gestural UI
- Tangible UI
- Organic UI
- Augmented Reality
- Automatic Identification
The first three technologies (i.e. Touch, Voice, and Gestural) are the most important and perhaps the most likely to be thrown into the NUI mix by other people. If a term used to describe a general metaphor for human-computer interaction in NUI technologies doesn't support at least the first three items in this list then it's not applicable to NUI.
In addition to being inclusive enough, the terms used to describe NUI's "WIMP" have to be concrete enough to provide guidance. Windows, Icons, Menus, and Pointer are all pretty clear. An acronym for NUI should be equally clear or it's not useful.
PETA
Jonathan Brill took a shot at this some time ago when he introduced the term PETA on his blog, which stands for Places, Animations, Things, and Auras. PETA wasn't intended to be generalized to NUI. It was specifically aimed at multi-touch. So PETA doesn't pass the first test of being inclusive. It's all about Touch UI, which is what was intended.
Post-WIMP
Another term that has seen some use lately is post-WIMP, which simply means whatever comes after WIMP. Not much of a definition, but certainly inclusive. This one passes the test of including the three primary NUI technologies in my list but fails to be concrete enough to be useful.
OCGM
OCGM, which stands for Objects, Containers, Gestures, and Manipulations, is the most recent submission for consideration by the NUI community. The acronym was created by Ron George and Joshua Blake. According to Ron and Joshua, OCGM is pronounced Occam, as in Occam's Razor (Occam is the guy in the picture above).
As much as I admire Ron George and Joshua Blake, I'm sorry to say that OCGM fails both of my tests. It does not include all three of the primary technologies I outlined, and it is too ambiguous to be useful. In addition, the terms used in the acronym overlap so much as to be redundant. Let me explain.
While you may not agree that all seven of the technologies I listed (i.e. Touch, Voice, Gesture, Tangible, Organic, AR, and Automatic Identification) are types of Natural User Interfaces, most people will agree that Speech, Gestures and Touch UIs are definitely NUI technologies. OCGM doesn't support speech. Speech is not a gesture nor is it a manipulation - at least not according to the definitions of those terms as they are understood today. A Direct Voice Input command is based on prompts, grammars, and dialog logic. Not gestures and manipulations. For this one reason OCGM doesn't work as a general metaphor for NUI.
There are other problems with OCGM. The biggest in my opinion is overlap and generalization. "O" stands for Objects. What is an object? That is a term that is so general as to be almost useless. "C" stands for Containers, which is equally ambiguous, not to mention redundant. Isn't a Container a kind of Object? Finally, the distinction between gestures and manipulations, while it makes sense in some cases, is also redundant. It's like differentiating between application menus and context or pop-up menus. I would argue that any manipulation is in fact a gesture, and if a gesture is used to manipulate the behavior of a computing system then it's a manipulation. Manipulation may be a subtype of Gesture or Gesture might be a kind of Manipulation - either way, listing them both doesn't help the definition at all.
The Three Aspects of Any Human-Computer Interface
So what do I suggest? I don't. That's the easy way out, but I can list three aspects of any human-computer interface that must be considered in order to create a WIMP-like term for NUI.
- Affordance
- Command
- Feedback
Affordance
Simply put, an affordance is some indicator of what can be done with the system. If there is no indication that you have to touch the monitor to make it work or that you need to speak to a device to give it commands, then there is no real affordance built into the system. Affordances tell you what commands you can use with a computer.
Commands
The idea behind a command is that it's any method by which a human communicates a desired action to the computer. It might be a gesture, a touch of a button, or a spoken command.
Feedback
This is obviously how the system responds to commands. If you execute a command on a system and nothing obvious happens, then you have no idea if your command worked. The NUI system has to respond in some way. In multi-touch that might be an animation or an aura; in a speech application it might be a vocal reply or a non-vocal sound.
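To make the three aspects a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the names `Affordance`, `Feedback`, `HumanComputerInterface`, `VoiceUI`, and `TouchUI` are hypothetical and don't come from any real toolkit. The idea is that very different modalities can satisfy the same small contract: advertise what can be done, accept a command, and respond.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class Affordance:
    """An indicator of what can be done, tied to the command it advertises."""
    hint: str
    command: str


@dataclass(frozen=True)
class Feedback:
    """The system's observable response to a command."""
    response: str
    accepted: bool


class HumanComputerInterface(Protocol):
    """Hypothetical contract: advertise affordances, answer commands with feedback."""

    def affordances(self) -> List[Affordance]: ...

    def execute(self, command: str) -> Feedback: ...


class VoiceUI:
    """Audio-only example: the prompt is the affordance, the reply is the feedback."""

    def affordances(self) -> List[Affordance]:
        return [Affordance(hint='spoken prompt: "How can I help you?"', command="speak")]

    def execute(self, command: str) -> Feedback:
        if command == "speak":
            return Feedback(response="vocal reply", accepted=True)
        return Feedback(response="non-vocal error sound", accepted=False)


class TouchUI:
    """Multi-touch example: a visible button affords touching, an animation confirms it."""

    def affordances(self) -> List[Affordance]:
        return [Affordance(hint="on-screen button", command="touch")]

    def execute(self, command: str) -> Feedback:
        if command == "touch":
            return Feedback(response="animation or aura", accepted=True)
        return Feedback(response="no visible change", accepted=False)


def advertised_commands(ui: HumanComputerInterface) -> List[str]:
    """List the commands a user could discover from the affordances alone."""
    return [a.command for a in ui.affordances()]


if __name__ == "__main__":
    for ui in (VoiceUI(), TouchUI()):
        for command in advertised_commands(ui):
            print(type(ui).__name__, command, "->", ui.execute(command).response)
```

Notice that nothing in the sketch needs an "object" or a "thing": the voice interface gets by with a prompt and a reply.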
Where are Objects?
You probably noticed that these general aspects of human-computer interaction do not explicitly include "objects", where an "object" might be a physical device, controller, or graphical representation of data being manipulated. The truth is that an "object" or "thing", as I believe those terms are intended to be used in OCGM and PETA respectively, is in fact an affordance or feedback or both. A Window provides both an affordance of what commands you can execute with your mouse as well as feedback in terms of how the graphical representation of the Window behaves when it's manipulated. Similarly, a prompt from a Voice UI (e.g. "How can I help you?") is an affordance and a voice or non-voice response is feedback. Are sounds things or objects? Some would say they are while others would say they are not, so in this case objects and things don't always work for NUIs, especially speech recognition systems that are audio modal only.
Conclusion
PETA was never intended to be generalized for NUI so it's not really considered here. I don't think Post-WIMP works very well because it's way too general to be useful. I also don't think that OCGM works all that well because it's not inclusive enough, it's redundant, and it's too abstract.
The three aspects of any human-computer interface paradigm are Affordances, Commands, and Feedback. If you don't fulfill all three then you don't have a complete meta-metaphor. In addition, if you can't drill down on each aspect then you don't have a useful guideline for people using a specific human-computer interface paradigm (e.g. Command-Line, GUI, or NUI).
As I said before, I think it's just too early to try to declare an NUI equivalent of WIMP - we haven't seen enough successful implementations to do that yet. In addition, we haven't even clearly defined NUI. Until we can do that we can't come up with a general metaphor common to all NUI applications.
One more thing. The types of solutions used to create NUIs are pretty broad - much broader in terms of inputs and outputs than GUI - so it may be that there is no NUI equivalent to WIMP.