How natural should a natural interface be?

Feb 16, 2009

I’m very happy to announce that, starting today, I will be working full-time on Ubiquity, a Mozilla Labs experiment to connect the web with language. I’ll be heading up research on different linguistic issues of import to a linguistic user interface and blogging about these topics here. If you’re interested, please subscribe to my blog’s RSS feed or the RSS feed for only Ubiquity-related items. Commenting is encouraged! ^^

Every day, more users are trying out Ubiquity, the Mozilla Labs experiment that lets users accomplish common Internet tasks faster through a natural language interface. As we live more and more of our lives on the web, there is a huge appeal to, and need for, a faster way to access and mash up our information.

But what exactly do we mean by a “natural language interface”? Is it just another programming language with lots of English keywords? Should the final goal be a computer that understands everything we tell it?

Ubiquity is not HAL

As we think about the future directions and possibilities of Ubiquity, we need to go back to our roots and understand the project’s motivations. With that in mind, here are some initial thoughts on the advantages of a natural language interface. The ultimate goal here is to refine the notion of natural language interface and to come up with a set of principles that we can follow in pushing Ubiquity further, into other languages and beyond.

Why language?

In his 2008 article in interactions, Aza describes a clear need for modern UI to move beyond monolithic do-everything apps into efficient, granular commands that can be connected to accomplish tasks. Hierarchical menus with an application’s every possible function are great for discoverability, but slow and inefficient as they grow. Aza advocates for the use of a familiar subset of natural language to this end. In his own words,

Words can capture abstractions that pictures cannot because language has an immense amount of descriptive and differentiating power. Abstract thoughts are exactly represented by the words that give them names. It is this power that comes to the rescue in specifying functionality.

In other words, language gives us the descriptive power to succinctly and creatively express our will, far faster than a series of menus, and with more freedom than a series of shortcuts or gestures. In addition, by tapping into the lexicon of our everyday language, we make a direct attack on the learnability problem.¹

The natural syntax test

The ability to string different commands together is not a novel one—indeed, this is what more traditional command lines and programming languages offer. However, these technologies present a huge barrier to the layperson, even for languages with many keywords from English or English-like syntax.

Programming languages can be such teases in this way. Often the first bits of code in a language look remarkably similar to natural language ([[Python]]):

print "Hello World"

…but the young coder is quickly disappointed:

print map(lambda x: x*2, [1,2,3])

[[AppleScript]] is a language that tries to take this idea further and, indeed, AppleScript code sometimes reads as plain English:

print pages 1 thru 5 of document 2

Dig a little deeper, though, and AppleScript also fails the “natural syntax” test. In fact, it can be argued that a language that looks like a natural language but differs in some important details can be even more difficult to use than one that is completely novel. Bill Cook, one of the original developers of [[AppleScript]], makes this point in his history of AppleScript: “in hindsight, it is not clear whether it is easier for novice users to work with a scripting language that resembles natural language, with all its special cases and idiosyncrasies.”

If the interface’s syntax is too restrictive or, worse, conflicts with a user’s natural intuitions about their natural language, it immediately fails to be “natural”, no matter how similar the keywords or grammar are.²
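One way to picture a syntactic restriction of this kind: English lets a speaker say either “pass Jono the ball” or “pass the ball to Jono”, and an interface that accepts only one of the two clashes with that intuition. The Python sketch below maps both word orders to the same structure. It is only an illustration under loud assumptions: the “pass” verb and the `parse_pass` helper are hypothetical and not part of Ubiquity, and the recipient is assumed to be a single word in the double-object form.

```python
def parse_pass(text: str) -> dict:
    """Accept both 'pass X the Y' and 'pass the Y to X' (hypothetical verb)."""
    words = text.split()
    if not words or words[0].lower() != "pass":
        raise ValueError("only the hypothetical 'pass' verb is handled here")

    args = words[1:]
    lowered = [w.lower() for w in args]

    if "to" in lowered:
        # Prepositional frame: pass <object...> to <recipient...>
        i = lowered.index("to")
        obj, recipient = args[:i], args[i + 1:]
    else:
        # Double-object frame: pass <recipient> <object...>
        # Assumption: the recipient is exactly one word.
        recipient, obj = args[:1], args[1:]

    if not obj or not recipient:
        raise ValueError(f"could not find both an object and a recipient in {text!r}")

    return {
        "verb": "pass",
        "recipient": " ".join(recipient),
        "object": " ".join(obj),
    }


# Both word orders yield the same parsed command.
a = parse_pass("pass Jono the ball")
b = parse_pass("pass the ball to Jono")
print(a == b)
```

Whether the verb exists at all is a lexical question. Accepting both frames once it does exist is the syntactic one.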

Towards a natural (and forgiving) syntax

Aza similarly laments the relegation of text-based interfaces to the higher echelons of geekdom in his 2008 paper: “if commands were memorable, and their syntax forgiving, perhaps we wouldn’t be so scared to reconsider these interface paradigms.”

The key word “forgiving” above (emphasis mine) is ambiguous between two senses, and we want a natural language interface to be forgiving in both:

1. Forgiving as in “not difficult to learn and remember”: the syntax must be easy and natural for the user, encouraging experimentation and intuitive application;
2. Forgiving as in “not correcting or prescriptive”: the system should try its darndest to accept the user’s input, even if it’s not the most “well-formed.”

From an implementation point of view, (2) above can also be an advantage. There are many grammatical restrictions in natural language which, as long as the command is unambiguous, Ubiquity need not enforce on the user. Take, for example, the two statements:

print two copy
print two copies

I feel that Ubiquity should execute both of these statements with equal ease. The numeral “two” makes the user’s intent very clear, even though the plural of “copy” should indeed be “copies.” It need not be the job of the interface to decide whether a sentence is “correct English.” By assuming that the user is trying to communicate a valid and possible task, rather than throwing up an error, the system will be more flexible and more forgiving in the inevitable case of human error. The ultimate goal should be to help the user accomplish their task.
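As a rough illustration of this idea, here is a minimal Python sketch of a parser that does not enforce number agreement. Everything in it is an assumption made for the example: the `parse_command` and `singularize` names, the tiny number-word table, and the fixed "verb, count, noun" shape are hypothetical and say nothing about how Ubiquity itself parses input.

```python
# Hypothetical number words; a real interface would need a much fuller table.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def singularize(noun: str) -> str:
    """Naively reduce an English noun to its singular form (illustration only)."""
    if noun.endswith("ies") and len(noun) > 3:
        return noun[:-3] + "y"          # copies -> copy
    if noun.endswith("s") and not noun.endswith("ss"):
        return noun[:-1]                # pages -> page
    return noun                         # copy -> copy


def parse_command(text: str) -> dict:
    """Parse '<verb> <count> <noun>' without checking singular/plural agreement."""
    words = text.lower().split()
    if len(words) != 3:
        raise ValueError(f"expected '<verb> <count> <noun>', got: {text!r}")

    verb, count_word, noun = words
    if count_word.isdigit():
        count = int(count_word)
    elif count_word in NUMBER_WORDS:
        count = NUMBER_WORDS[count_word]
    else:
        raise ValueError(f"unrecognized count: {count_word!r}")

    # The count carries the intent, so the noun's number marking is ignored.
    return {"verb": verb, "count": count, "object": singularize(noun)}


# Both inputs are accepted and produce the same command.
print(parse_command("print two copy") == parse_command("print two copies"))
```

The design choice is to treat the numeral as the source of truth and to normalize the noun, rather than to reject input that a grammar checker would flag.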

Conclusion

By developing a language interface which truly feels natural to the user, we can successfully bring the power of text-based interfaces to the masses. I feel the key to this “natural-ness” is a less restrictive and in fact forgiving syntax. This goal, akin to [[natural language programming]], may be daunting from an implementation angle and may even prove impossible in general. However, as long as we only aim to execute simple imperative commands, the set of syntactic structures we need to handle is limited.

Ubiquity as it stands is many different things for many people. The natural language guidelines above may feel too restrictive to many current developers for whom Ubiquity is simply a convenient new way to extend Firefox.³ This discussion also seems orthogonal to the mouse-based Ubiquity experiments. As users and developers, how do you feel about the potential benefits and downsides of these natural syntax guidelines? In the coming days I’ll look at some concrete examples of what this “forgiving” syntax would demand of Ubiquity.

¹ The learnability problem of a linguistic interface, particularly in light of the usability vs. discoverability paradigm, is a topic for a future post. ;)
² It’s important to note that the “restrictions” I’m concerned with here are syntactic ones, not lexical ones. That is, if either of the Ubiquity commands below fails because we don’t have a “pass” verb, that’s fine. But if Ubiquity allows one string but not the other, that’s a syntactic restriction which goes against our English intuition.

pass Jono the ball
pass the ball to Jono

I’ll cover this in a future post.
³ In fact, I myself am also guilty of this… my select command for SQL queries clearly does not encourage a natural language-compatible syntax.