Book Review

The Humane Interface: New Directions for Designing Interactive Systems by Jef Raskin. Reading, Mass.: ACM Press, 2000. 256 p. $24.95 (ISBN 0-201-37937-6).

Tom Zillner, Editor

Jef Raskin has the chops to write a book about humane interfaces. He's been in the business of interface design for a long time. Before Steve Jobs took over the project personally, Raskin was the project leader for the Macintosh; the jacket copy for The Humane Interface somewhat hyperbolically terms him "the creator of the Macintosh computer project." The truth is that Raskin is much better than the eventual interface design of the Macintosh suggests, as his subsequent work demonstrates.

The clearest explanation of the book's raison d'être comes in its conclusion, where Raskin lays out what he hopes he's accomplished. Perhaps it would be a good idea for the reader to start with the conclusion, and discover that Raskin's goal is to explain how to make interfaces as simple as possible given the limitations and capabilities of human beings. Certainly a laudable goal, and one that he accomplishes principally by exploring "cognetics," the study of cognitive psychology as it can be applied to engineering.

It is in the exploration of cognetics that it first becomes apparent through empirical methods why our computer interfaces are as bad as they are: they have little to do with human abilities and fail to take into account the blind spots of the human mind. The biggest blind spot is our inability to easily deal with modes, the differing behavior of an interface depending on the state of the system. Because we are not "wired" to roll with these changes of state, we frequently use keystrokes or other gestures (clicking, moving the mouse, etc.) in ways inappropriate to the current state of our application or operating system. But our mistakes are not reflections of our own disability. They are indicative of the failure of engineers and programmers to design humanely. We should not have to adapt to the peculiarities of the computer; rather, the computer should reflect the ways humans behave.

Digressing slightly, I once had a colleague who espoused the view that eventually personal computers tended to develop personalities, and when they did, when they began to exhibit idiosyncratic "behavior," it was at that point that they became problems to use and maintain. He was right of course, but his view was too constricted. In actuality, as Raskin clearly reveals in his critiques of current computer system behavior in the light of cognetics, all computers and particularly their user interfaces possess personalities, and they are personalities that we humans are ill-equipped to deal with effectively. Thus our frustration in working with computers. Simple tasks become complicated, and complicated tasks may prove impossible. It's like trying to get eccentric Uncle Harry to type up a stack of mailing labels from your database. Sometimes it's just not worth the effort.

What's to be done? Raskin has lots of concrete ideas, but at first blush they don't seem to be applicable to the average applications programmer. Few of us are involved in the design of commercial or even open source operating environments, so we may believe that much of what Raskin talks about is of no use to us in our quotidian responsibilities of constructing and modifying library applications and Web sites. I believe otherwise, or it would be a waste to review this book for this journal. For one thing, Raskin provides some tools to measure the usability of interfaces and compare them quantitatively. The tools he introduces and demonstrates (in the chapter "Quantification") include the GOMS (goals, operators, methods and selection rules) keystroke-level model, Hick's Law and Fitts' Law. Each requires calculations to be applied to characteristics of human-computer interactions, yielding measures of interface difficulty (or ease). It's difficult to present this material for non-engineers without getting bogged down in arcane formulae and the more rigorous aspects of mathematics. I'm ashamed to admit I am a relative illiterate when it comes to mathematics, but I found most of Raskin's exposition clear and not encumbered with needless complexity. He uses the example of the user interface for a program that performs temperature conversions (Fahrenheit to Celsius and vice versa) as his vehicle for applying the calculations he discusses. This example is both elementary and quite illustrative of the principles he discusses.
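For readers who want to see what such calculations look like in practice, here is a minimal Python sketch of the three measures named above. It is an illustration under stated assumptions, not a reproduction of Raskin's worked examples: the operator times are the approximate values commonly quoted for the keystroke-level model, the constants a and b are rough placeholders that would have to be determined empirically, and all function names are invented for this sketch.

```python
import math

# Approximate keystroke-level operator times in seconds. These are commonly
# quoted ballpark figures, used here as assumptions; real values vary by user.
KLM_SECONDS = {
    "K": 0.2,   # press a key
    "P": 1.1,   # point at a screen target with the mouse
    "H": 0.4,   # move a hand between keyboard and mouse
    "M": 1.35,  # mentally prepare for the next step
}


def klm_time(operators: str) -> float:
    """Estimate task time by summing the times of a sequence of operators."""
    return sum(KLM_SECONDS[op] for op in operators)


def fitts_time_ms(distance: float, size: float, a: float = 50.0, b: float = 150.0) -> float:
    """Fitts' Law: time to move to a target of a given size at a given distance.

    a and b are assumed rough constants in milliseconds, not measured values.
    """
    return a + b * math.log2(distance / size + 1)


def hick_time_ms(n_choices: int, a: float = 50.0, b: float = 150.0) -> float:
    """Hick's Law: time to choose among n equally likely alternatives."""
    return a + b * math.log2(n_choices + 1)


if __name__ == "__main__":
    # Hypothetical method: think, then type four characters.
    print(f"KLM estimate: {klm_time('MKKKK'):.2f} s")
    # Hypothetical target 10 units wide, 70 units away.
    print(f"Fitts estimate: {fitts_time_ms(70, 10):.0f} ms")
    # Hypothetical menu with seven equally likely items.
    print(f"Hick estimate: {hick_time_ms(7):.0f} ms")
```

The point of such estimates is comparison: two candidate designs for the same task can be written out as operator sequences and ranked before anyone builds a prototype.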

Raskin also discusses the information-theoretic efficiency of interfaces: the minimum amount of information required to perform a task is divided by the amount of information that a user must supply. This formulation gets a bit hairier, but it is important, for if we measure the potential information efficiency of an interface we can then evaluate how proposed interfaces "measure up." Raskin continues to use the temperature conversion case to show that what at first seem like fairly efficient data entry methods are, in point of fact, too wasteful of users' time and are thus inefficient. Ultimately, he proposes a design that is both elegant in appearance and efficient in execution. Again, the whole chapter is presented so that the relevance of the quantitative measures is immediately apparent and at the same time one is not put off by the mathematics. Although it is clear that there is more difficult mathematics lurking beneath the gloss that Raskin provides, he offers sufficient information for most of us to produce better interface designs.
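The ratio described above can be sketched in a few lines of Python. This is only an illustration of the definition as stated in this review (minimum information required divided by information the user must supply); the function names and the sample figures are hypothetical, and the equal-probability assumption is a simplification.

```python
import math


def bits_for_choice(n_equally_likely: int) -> float:
    """Information, in bits, carried by one choice among n equally likely options."""
    if n_equally_likely < 1:
        raise ValueError("there must be at least one option")
    return math.log2(n_equally_likely)


def information_efficiency(minimum_bits: float, supplied_bits: float) -> float:
    """Minimum information the task requires divided by what the user must supply."""
    if supplied_bits <= 0:
        raise ValueError("the user must supply some information")
    return minimum_bits / supplied_bits


if __name__ == "__main__":
    # Hypothetical task needing 10 bits, through an interface demanding 40 bits.
    print(information_efficiency(10, 40))

    # A forced response with only one possible answer carries no information,
    # so however many bits the keystroke costs (5 is an assumed figure),
    # the efficiency is zero.
    print(information_efficiency(bits_for_choice(1), 5))
```

An efficiency of 1 would mean the user supplies nothing beyond what the task strictly requires; real interfaces fall short of that, and the measure shows by how much.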

Once we exit the realm of quantification, we come to "Unification." The idea is that an application as a separate entity within an operating environment is just another example of a mode, which is a bad thing. Therefore, there should be no separate applications as such. I'll return to this controversial proposal in a moment, although to all appearances the idea that applications should not exist separate from the operating system is irrelevant to those of us in the programming trenches. After all, there is little that we can do in terms of the bigger picture. We are stuck with the operating systems in which we work. But before I explore that thread further, I should back up to the idea that modes are a bad thing. It's a hobby horse that Raskin rides pretty hard, so perhaps it ought to be examined more carefully. What is so bad about modes? As I noted above, modes are mainly bad because they make us more prone to make mistakes by assuming we are working in one mode of the application or system when we're actually in another. Key combinations or mouse movements or other gestures result in different results depending on the mode we're in, so we end up doing the "wrong" thing. A deeper question is why we often don't attend to mode changes. Even some fairly clear indication on the screen that we are in a particular mode will rarely result in consistent recognition by the user and consequent successful interaction. The key is in the question itself. We don't attend to mode changes because they are not our focus of attention.

Raskin does a good job of presenting the preliminary material that leads up to this insight. Essentially, we function using our cognitive conscious and our cognitive unconscious, and each has distinctly different properties, with consequent strengths and weaknesses for various sorts of tasks. The most pertinent characteristic of our cognitive conscious when considering attention is that our conscious mind can only focus clearly on one thing. So, as I write this review on my computer I am concentrating on the words I am writing and not really focusing on my hands on the keys or the clock in the upper right corner of the screen, or the animated icon of a book at the bottom of the page. My unconscious takes care of the typing while I focus on what it is that I want to write. And the words flow. But if I am focused on the words that I'm writing, mode changes will occur at the periphery of my thoughts, or more accurately in my cognitive unconscious. If I then focus on dealing with the mode change appropriately, I have switched my focus of attention away from the task at hand. My interaction with the application has stopped happening unconsciously and has pushed its way to my consciousness. But because my cognitive conscious operates serially, on one task at a time, I'm no longer doing my work but am instead wrestling with the interface. That's what makes modes bad. They detract from the tasks we want to accomplish.

Returning to the inherent modality of running separate applications, for almost all of us this is unavoidable. (Raskin designed a computer called the Canon Cat that doesn't have separate applications, and I'm sure there are other even less well known exceptions.) I agree with Raskin's arguments that applications as separate entities are not a very good idea, except that as a practical matter it would probably be fairly difficult to produce marketable computer software in the way that he suggests, which is essentially as a command-by-command extension of the operating environment. In other words, it's a good idea whose time hasn't come and may never arrive. That doesn't make his discussion of unification of operating environment with applications a waste of time. Within it Raskin devotes a lot of time and attention to particularities of the user interface that can potentially be worked on even from the lowly standpoint of the application programmer, although it is often a struggle against the tyranny of the operating environment. Early on, he outlines elementary operations, those manipulations of content out of which all more elaborate or complex operations are constructed. Content can be indicated (pointed at), selected, activated and modified. The elementary types of modification are generation, deletion, moving, transformation and copying. Although Raskin rightly claims that these basic operations should be fundamental to the computer or appliance itself, the wily application programmer can partially compensate for deficiencies in the operating environment by constructing "fixes" local to an application or suite of applications.

Note that Raskin would hate the suggestion that application programmers should provide corrections to less than optimal user interfaces within their individual programs, because such changes introduce yet another set of modes, and can add to the overall confusion experienced by users. In general, I agree. I think that there are some reasonable exceptions to this rule. The best case for subverting the provided interface is where users never or almost never leave the application or set of applications. Consider data entry operators, who simply sit in front of screens all day long and pound in text. It should be possible to optimize the applications they use so that the programs become powerful data entry tools. Probably the most pertinent changes are to minimize use of the graphical input device (mouse or trackball) and allow all cursor movement functions to be accomplished through keyboard shortcuts. Turning to the library world, a cataloger needs access to the USMARC character set as a basic feature, with possibly any number of additional special language character sets if he or she catalogs materials from foreign countries that use non-Latin character sets. And it ought to be easy to change from one set to another. Windows provides some of this functionality, but not all of it. Why not build it in?

Turning to more general applications, Raskin discusses means to more intelligently select text (or other objects) and manipulate it (or them). In the case of text, one of Raskin's suggestions is that old selections be remembered, so that operations can be performed on both the currently selected text and one or more previous selections. For example, if the exchange operation is to be performed, the current selection and the previously selected block are exchanged. Think of what you currently have to do to perform this operation in Microsoft Word or most other word processing packages and it becomes clear how useful this easier exchange function could be. Turning back to our imagined data entry operators or catalogers, it might be useful for either set of users to have applications that allow easy exchanges.
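An application programmer could approximate the exchange operation along the following lines. This Python sketch is an assumption about how it might be done, not Raskin's implementation: selections are represented as (start, end) character offsets, and the function name is invented for the example.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def exchange(text: str, current: Span, previous: Span) -> str:
    """Swap the text of two non-overlapping selections."""
    # Order the spans by position so the slicing below is straightforward.
    (a_start, a_end), (b_start, b_end) = sorted([current, previous])
    if a_end > b_start:
        raise ValueError("selections must not overlap")
    return (
        text[:a_start]
        + text[b_start:b_end]   # later selection moves to the earlier position
        + text[a_end:b_start]   # text between the selections is untouched
        + text[a_start:a_end]   # earlier selection moves to the later position
        + text[b_end:]
    )


if __name__ == "__main__":
    print(exchange("alpha beta gamma", (11, 16), (0, 5)))
```

The application only has to remember the previous selection; the user's gesture is a single command rather than a cut, a move, a paste and a second round of the same.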

An interesting example of unification and the consequent blurring between operations is the performance of calculations. Why not do them within text rather than requiring a separate application for calculations or a calculator that sits next to your computer? Select the text embodying the computation (e.g., 300 * .85) within your document. Now, select the word "calculate." Finally, push a key designated the Command key. The operation designated by the current selection is performed using previous selections as operands, and the original text is replaced with the result. In the case of "calculate," only one previous selection is used, but in, for example, the case of the "send e-mail" command perhaps the newest of the old selections would contain the e-mail address and the next oldest might contain the message body. Raskin points to several purported advantages of this method of supporting commands. For one thing, it again allows the avoidance of special modes to accomplish specific functions. For another, it allows the user to construct command lists containing frequently used operations. Or the commands might be contained in an online manual. Documentation could be consulted and commands executed directly from the documentation. Giving commands no special place to live (like menus) means that they can live anywhere, and is another opportunity to escape from modality (command mode versus entry mode), although it is hard to see how selecting several blocks of text and then a command and finally pressing a command key really is better in terms of retaining focus than selecting text and then pulling down a menu to select a command. In any case, it is desirable to let the user choose where commands reside so that the user's most frequently used commands are where they are easiest to access.
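The calculate example can be made concrete with a toy model. The Python sketch below is purely illustrative: the Document class, its method names and the restriction to simple arithmetic are assumptions made for this example, not a description of how Raskin's systems work. It keeps a history of selections, treats the newest one as the command and the one before it as the operand, and replaces the operand text with the result.

```python
import ast
import operator

# Arithmetic operators the toy "calculate" command understands.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_arithmetic(text: str) -> float:
    """Evaluate simple arithmetic such as '300 * .85' without using eval()."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")
    return walk(ast.parse(text.strip(), mode="eval"))


class Document:
    """Hypothetical document that remembers selections, oldest first."""

    def __init__(self, text: str):
        self.text = text
        self.selections = []  # list of (start, end) offsets

    def select(self, fragment: str) -> None:
        """Select the first occurrence of fragment in the document."""
        start = self.text.index(fragment)
        self.selections.append((start, start + len(fragment)))

    def press_command_key(self) -> None:
        """Run the newest selection as a command on the previous selection."""
        cmd_start, cmd_end = self.selections.pop()
        command = self.text[cmd_start:cmd_end]
        if command != "calculate":
            raise ValueError(f"unknown command: {command}")
        op_start, op_end = self.selections.pop()
        result = evaluate_arithmetic(self.text[op_start:op_end])
        # Replace the operand text with the result, as the review describes.
        self.text = self.text[:op_start] + format(result, "g") + self.text[op_end:]


if __name__ == "__main__":
    doc = Document("Discounted price: 300 * .85\n\nCommands: calculate")
    doc.select("300 * .85")
    doc.select("calculate")
    doc.press_command_key()
    print(doc.text)
```

Note that the word "calculate" is ordinary text in the document, here sitting in a list of commands at the end, which is the sense in which commands have no special place to live.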

Another idea of Raskin's that has merit is the use of transparent error message boxes. Have you ever been working and had an error message box pop up and obscure the object of the message? Often you can relocate the box by dragging, but why should you have to? Raskin's idea is that all error message boxes could be transparent. In point of fact, all message boxes could be transparent. Again, this allows the user to keep his or her focus of attention on the content, and respond to the messages only as necessary. This could ameliorate the problem that Raskin describes where the information-theoretic efficiency of a message is zero, i.e., where a message requires the user to respond but the user can do nothing to change the state of affairs. An example of such a message is "Word has finished searching the document," requiring an "OK" response. This message could just as easily be delivered without requiring a user response in the type of transparent box Raskin describes and illustrates. Better yet, put an unobtrusive indicator off to the side and don't display a distracting and not very useful message. Note that the use of transparent boxes is one case where it is difficult for the application programmer to alter the contrary behavior of the operating system. I am thinking particularly about how I as a Visual Basic programmer could create such boxes under Windows. I suspect that there is a way to do it, but I'm sure it requires more complex programming than my capabilities allow. In other words, I'm stumped.

This comes back to the question of why you would want to read this book in the first place. If you work at all with user interfaces, I hope my review has convinced you that Raskin is worth a read. In particular, the formulae he discusses provide a very useful quantitative side to evaluating your proposed interface designs. Although there are no substitutes for usability studies, it's extremely useful to submit only the best possible designs for testing. Finally, although I have given as short shrift as Raskin does to Web site and Web form design, it is obvious that all of the quantitative and much of the qualitative material in The Humane Interface apply to Web work as much as they do to software and hardware design. If you do any work with user interfaces of whatever sort, this book is worth your time.