Exploring Command Chaining in Ubiquity: Part 1

Aug 19, 2009

Since the [dawn of time][1] people have been asking about command chaining in Ubiquity. If you have a `translate` command and an `email` command, it would be great to be able to, for example, `translate hello to Spanish and email to Juanito`. This is what we call command chaining or **piping** (as in a Unix pipeline): in a single complex query, you specify multiple (probably two) actions and use the first’s output as the second’s input.[^1]

Today I hope to cover some of the technical considerations required in implementing command chaining in Ubiquity, and I will follow up soon with a blog post on the linguistic considerations required as well.

Technical considerations: hooking the pipes together

I’d first like to lay out some technical challenges and questions. These can be broken into two different categories: (1) how the parse and display of suggestions is affected and (2) how the execution is affected.

Matching inputs and outputs

We’ll first consider how command chaining may affect parsing. Each Ubiquity command specifies the types of arguments it expects using noun types, such as `noun_arb_text`, which accepts anything; `noun_type_number`, which accepts numbers; or `noun_type_language`, which takes the name of a language. For example, the `translate` verb takes at most three arguments: a `noun_arb_text` object, a `noun_type_language` goal (the language to translate into), and a `noun_type_language` source (the source language). To implement command chaining, it will be necessary to identify the appropriate noun types for the output of a command as well.
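To make the idea of noun types concrete, here is a minimal, self-contained sketch in JavaScript. It does not use the real Ubiquity API: the objects below are simplified stand-ins that only borrow the names `noun_arb_text`, `noun_type_number`, and `noun_type_language` from the text. The `accepts` method, the `KNOWN_LANGUAGES` list, the shape of the `translate` object, and the `acceptsArguments` helper are all assumptions made for illustration.

```javascript
// Simplified stand-ins for Ubiquity noun types (NOT the real implementations).
// Each one only answers: "could this text be an argument of my type?"
var KNOWN_LANGUAGES = ["english", "spanish", "french", "japanese"]; // illustrative

var noun_arb_text = {
  name: "noun_arb_text",
  accepts: function (text) { return typeof text === "string"; } // anything goes
};

var noun_type_number = {
  name: "noun_type_number",
  accepts: function (text) { return /^-?\d+(\.\d+)?$/.test(text.trim()); }
};

var noun_type_language = {
  name: "noun_type_language",
  accepts: function (text) {
    return KNOWN_LANGUAGES.indexOf(text.trim().toLowerCase()) !== -1;
  }
};

// A hypothetical declaration of the translate verb's three arguments.
var translate = {
  name: "translate",
  args: {
    object: noun_arb_text,      // the text to translate
    goal: noun_type_language,   // the language to translate into
    source: noun_type_language  // the source language
  }
};

// True if every supplied argument is accepted by the noun type of its role.
function acceptsArguments(command, supplied) {
  return Object.keys(supplied).every(function (role) {
    var nounType = command.args[role];
    return nounType !== undefined && nounType.accepts(supplied[role]);
  });
}

console.log(acceptsArguments(translate, { object: "hello", goal: "Spanish" })); // true
console.log(acceptsArguments(translate, { object: "hello", goal: "42" }));      // false
```

For chaining, the open question is which of these noun types the output of a first verb would satisfy, so that it can be slotted into a role of the second verb.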

The first question we must address here is “what is the chaining output of a command?” Is it the preview text? Some text output from the execution?

[Image: Big fish eat da lil fish by joemud, CC-SA-NC][2]

To put this question into perspective, we note that Ubiquity commands can be broadly classified into two types: lookup and action execution. Here’s a classification which I believe to be exhaustive:[^2]

| classification | preview | execution | example |
|---|---|---|---|
| lookup | data lookup | inserting result into page | `translate` |
| lookup | data lookup | opening a website | `weather`, most search commands |
| lookup | data lookup | copying result to pasteboard | `get email address` |
| lookup | data lookup | nothing[^3] | |
| action | nothing (maybe a description of what the action will do) | an action which changes some state (in the browser or on the web) | `quit firefox`, `email`, `twitter` |

In light of this classification, I believe lookup commands are much more likely than action commands to be the first verb in a command chain. Chains such as `email hello to Blair and then do ...` or `twitter hello and then ...` are quite unlikely, since an action typically produces no result for a second verb to consume.


Thus, in the same way that not all commands have a useful execution, perhaps only lookup commands will have a chainable output: the results of the lookup. Even with this restriction, we will most likely need to implement a new “chainable output” method or getter in these commands. This means that commands will need to opt in to become chainable, but I believe this is a necessary evil.
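As a rough sketch of what such an opt-in might look like, consider the following self-contained JavaScript. The method name `chainableOutput`, the command objects, and the tiny fake dictionary are all hypothetical; nothing here is an existing Ubiquity API.

```javascript
// Hypothetical command objects. The chainableOutput method is an assumed
// name for the proposed opt-in hook; it is not part of any existing API.
var translate = {
  name: "translate",
  // Lookup command: opts in by exposing the result of its lookup.
  chainableOutput: function (args) {
    // A tiny fake dictionary stands in for a real translation service.
    var fakeDictionary = { hello: { spanish: "hola" } };
    var entry = fakeDictionary[args.object.toLowerCase()] || {};
    return entry[args.goal.toLowerCase()] || null;
  }
};

var quitFirefox = {
  name: "quit firefox"
  // Action command: no chainableOutput, so it cannot feed another verb.
};

// A command can be the first verb of a chain only if it has opted in.
function isChainable(command) {
  return typeof command.chainableOutput === "function";
}

console.log(isChainable(translate));   // true
console.log(isChainable(quitFirefox)); // false
console.log(translate.chainableOutput({ object: "hello", goal: "Spanish" })); // hola
```

Under this design, existing commands keep working unchanged, and the parser would only ever consider commands that pass `isChainable` as the first verb of a chain.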

The second question we must address is “when do we establish the noun type of a command’s chainable output?” One unsung but crucial feature of the way Ubiquity works now is that suggestions’ previews are not computed until that suggestion is selected (except for the first suggestion, which in most skins gets previewed immediately). Should we wait for all of the first verbs’ chainable output to be computed and then run them through the noun type detection system? Or should verbs with chainable output also a priori specify what noun types their output will be?

Both of these approaches have their problems. If we compute the chainable output of the first verb, run noun type detection on it, and then suggest the full combination only if it matches what the second verb expects, there will be clear performance implications, not to mention that it could greatly complicate our parsing algorithm. The latter (a priori) approach avoids these performance implications, but each command will have to list (by name or reference) the noun types that match its output. If a command author is unaware of someone else’s noun type, that chain will be impossible, even if the chainable output itself does indeed match that noun type. The former (a posteriori) approach would never have this issue. What other benefits or problems do you foresee with either of these approaches? Is there another approach which avoids these pitfalls?
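The trade-off between the two approaches can be sketched in a few lines of self-contained JavaScript. Everything here is hypothetical: the `zipLookup` verb, the third-party `noun_type_zipcode`, the `declaredOutputTypes` and `chainableOutput` properties, and both matching functions are illustrative assumptions, and the noun types are simplified stand-ins rather than Ubiquity’s own.

```javascript
// Illustrative noun types (simplified stand-ins, not Ubiquity's own).
var noun_type_number = {
  name: "noun_type_number",
  accepts: function (text) { return /^-?\d+(\.\d+)?$/.test(text); }
};
// Imagine this one was written by a third party the first author never heard of.
var noun_type_zipcode = {
  name: "noun_type_zipcode",
  accepts: function (text) { return /^\d{5}$/.test(text); }
};

// A hypothetical first verb whose author declared its output type up front.
var zipLookup = {
  name: "zip code",
  declaredOutputTypes: ["noun_type_number"], // a priori declaration
  timesComputed: 0,                          // counts the (costly) lookups
  chainableOutput: function () {
    this.timesComputed += 1;
    return "94041";                          // pretend lookup result
  }
};

// A priori: trust the declared list; the output is never computed.
function matchesAPriori(firstVerb, expectedNounType) {
  return firstVerb.declaredOutputTypes.indexOf(expectedNounType.name) !== -1;
}

// A posteriori: compute the output first, then run noun type detection on it.
function matchesAPosteriori(firstVerb, expectedNounType) {
  var output = firstVerb.chainableOutput(); // potentially slow (network, etc.)
  return expectedNounType.accepts(output);
}

console.log(matchesAPriori(zipLookup, noun_type_zipcode));     // false: type was not listed
console.log(matchesAPosteriori(zipLookup, noun_type_zipcode)); // true: "94041" matches
console.log(zipLookup.timesComputed);                          // 1
```

The a priori check is cheap but misses the third-party noun type; the a posteriori check finds the match but only after paying for the lookup, and it would pay that cost for every candidate first verb during parsing.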

(A)synchronous composability

Once we have the noun types, parsing, and suggestions down, all that remains is to compute the previews and implement the composite execution. Since the Ubiquity command manager already wraps each command’s preview and execute functions (to facilitate localization, among other uses), it would be easy to make the command manager compose asynchronous processes pseudo-synchronously. No major changes should be necessary to do the previews and executions, though, again, there will be a performance cost.
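One way to picture “composing asynchronous processes pseudo-synchronously” is a small combinator that runs the second step only once the first has delivered its output. The sketch below is self-contained JavaScript in callback style; the `chain` function and the two fake commands are hypothetical and do not reflect how the command manager is actually written, and `setTimeout` merely stands in for a network request.

```javascript
// Hypothetical asynchronous commands in continuation-passing style: each
// takes an input and a callback(error, output).
function translateToSpanish(text, callback) {
  var fakeDictionary = { hello: "hola" };
  setTimeout(function () {
    var result = fakeDictionary[text];
    if (result === undefined) {
      callback(new Error("no translation for: " + text));
    } else {
      callback(null, result);
    }
  }, 10);
}

function emailToJuanito(body, callback) {
  setTimeout(function () {
    callback(null, "sent to Juanito: " + body);
  }, 10);
}

// Compose two asynchronous steps so that callers see a single step:
// the second runs only after the first succeeds, on the first's output.
function chain(first, second) {
  return function (input, callback) {
    first(input, function (error, intermediate) {
      if (error) {
        callback(error); // stop the chain; the second verb never runs
        return;
      }
      second(intermediate, callback);
    });
  };
}

var translateAndEmail = chain(translateToSpanish, emailToJuanito);
translateAndEmail("hello", function (error, output) {
  console.log(error ? error.message : output); // sent to Juanito: hola
});
```

The composite has the same shape as a single command, so whatever calls it does not need to know that two verbs are involved. The performance cost is visible too: the total latency is the sum of both steps.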

Conclusion

There are a number of technical questions which must be answered, mostly in the parsing/suggesting stage. The key questions to answer are:

1. What is the chaining output of a command?
2. When do we establish the noun type of a command’s chainable output?

I’ll make another post soon on the linguistic considerations necessary in making command chaining happen in a natural fashion.

[^1]: We’ll limit the discussion here to cases where the two verbs are not simply two simultaneous commands but two commands which operate successively on an input, i.e., true piping. This rules out, for example, input such as `google dogs and translate cat to Spanish`, as the second command’s execution does not semantically depend on the first’s. This (hopefully uncontroversial) decision also affects the linguistic considerations to be made in my next post.

[^2]: If you know of a command which doesn’t neatly fit into “lookup” or “action”, please let me know.

[^3]: I believe we should mark these no-execution lookup commands visually so the user does not expect anything to happen if they execute them. This is trac #651.

[1]: http://labs.mozilla.com/2008/08/introducing-ubiquity/
[2]: http://www.flickr.com/photos/joemud/2851415655/

Verb type combinations in a two-verb chain:

| first verb type | second verb type | example |
|---|---|---|
| lookup | action | `translate this to Spanish and email to Aza` |
| lookup | lookup | `translate this to English and then find it with Amazon` |
| action | action/lookup | no use case? |