Exploring Command Chaining in Ubiquity: Part 1

Since the dawn of time, people have been asking about command chaining in Ubiquity. If you have a translate command and an email command, it would be great to be able to say, for example, “translate hello to Spanish and email to Juanito”. This is what we call command chaining or piping: specifying multiple (probably two) actions in a single complex query and using the first’s output as the second’s input. [1]

Today I hope to cover some of the technical considerations required in implementing command chaining in Ubiquity, and I will follow up soon with a blog post on the linguistic considerations required as well.

Technical considerations: hooking the pipes together

I’d first like to lay out some technical challenges and questions. These can be broken into two different categories: (1) how the parse and display of suggestions is affected and (2) how the execution is affected.

Matching inputs and outputs

We’ll first consider how command chaining may affect parsing. Each Ubiquity command specifies the types of arguments it expects using noun types, such as noun_arb_text, which accepts anything; noun_type_number, which accepts numbers; or noun_type_language, which takes the name of a language. For example, the translate verb takes at most three arguments: a noun_arb_text object (the text to translate), a noun_type_language goal (the language to translate into), and a noun_type_language source (the source language). To implement command chaining, we will need to identify the appropriate noun types for the output of a command.
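To make the argument structure concrete, here is a minimal sketch in JavaScript. It does not use the real Ubiquity API: the score method, the layout of the arguments array, the KNOWN_LANGUAGES list, and the acceptsArgument helper are all assumptions made for illustration, and the noun types are drastically simplified stand-ins for the real ones.

```javascript
// Hypothetical, simplified noun types. Each one scores how well a string
// fits the type (0 means "no match"). The real Ubiquity noun types do more.
const noun_arb_text = {
  name: "noun_arb_text",
  score: (s) => (s.length > 0 ? 0.3 : 0), // accepts anything, with a low score
};

const KNOWN_LANGUAGES = ["english", "spanish", "japanese"]; // illustrative list

const noun_type_language = {
  name: "noun_type_language",
  score: (s) => (KNOWN_LANGUAGES.includes(s.trim().toLowerCase()) ? 1 : 0),
};

// The translate verb with its three arguments, as described in the text.
const translate = {
  name: "translate",
  arguments: [
    { role: "object", nountype: noun_arb_text },
    { role: "goal", nountype: noun_type_language },
    { role: "source", nountype: noun_type_language },
  ],
};

// Returns true if the verb has an argument with this role and the input
// gets a non-zero score from that argument's noun type.
function acceptsArgument(verb, role, input) {
  const arg = verb.arguments.find((a) => a.role === role);
  return arg !== undefined && arg.nountype.score(input) > 0;
}

console.log(acceptsArgument(translate, "goal", "Spanish"));
console.log(acceptsArgument(translate, "goal", "hello"));
```

For chaining, the question becomes which of these noun types the output of a first verb would satisfy when it is fed into a second verb's argument.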

The first question we must address here is “what is the chaining output of a command”? Is it the preview text? Some text output from the execution?

[Image: “Big fish eat da lil fish” by joemud, CC-SA-NC]

To put this question into perspective, note that Ubiquity commands can be broadly classified into two types: lookup and action. Here’s a classification which I believe to be exhaustive [2]:

classification | preview                      | execution                          | example
---------------+------------------------------+------------------------------------+------------------------------
lookup         | data lookup                  | inserting result into page         | translate
               |                              | opening a website                  | weather, most search commands
               |                              | copying result to pasteboard       | get email address
               |                              | nothing                            |
action         | nothing (maybe a description | an action which changes some state | quit firefox, email, twitter
               | of what the action will do)  | (in the browser or on the web)     |

In light of this classification, I believe lookup commands are much more likely than action commands to be the first verb in a command chain. Chains such as “email hello to Blair and then do ...” or “twitter hello and then ...” are quite unlikely.

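first verb type | second verb type | example
----------------+------------------+-------------------------------------------------------
lookup          | action           | translate this to Spanish and email to Aza
lookup          | lookup           | translate this to English and then find it with Amazon
action          | action/lookup    | no use case?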

Thus, in the same way that not all commands have a useful execution [3], perhaps only lookup commands will have a chainable output: the results of the lookup. Even with this restriction, we will most likely need to implement a new “chainable output” method or getter in these commands. This means that commands will need to opt in to become chainable, but I believe this is a necessary evil.
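As a rough sketch of what opting in might look like, here is a lookup command that exposes its result through a chainableOutput method while an action command does not. The method name chainableOutput, the isChainable helper, and the tiny DICTIONARY standing in for a translation service are all hypothetical; nothing here is part of Ubiquity today.

```javascript
// Stand-in for a real translation service (illustrative data only).
const DICTIONARY = { hello: { spanish: "hola" } };

// A lookup command that opts in to chaining.
const translate = {
  name: "translate",
  lookup(args) {
    const entry = DICTIONARY[args.object.toLowerCase()];
    const result = entry && entry[args.goal.toLowerCase()];
    return result || null; // null when we have no translation
  },
  preview(args) {
    // The preview is decorated for display, so it is a poor chaining output.
    return `Translation: ${this.lookup(args)}`;
  },
  // Hypothetical opt-in hook: the raw lookup result, without decoration.
  chainableOutput(args) {
    return this.lookup(args);
  },
};

// An action command: it changes state and has nothing to pass along.
const quitFirefox = {
  name: "quit firefox",
  execute() {
    /* would close the browser */
  },
};

// A command is chainable only if it has opted in.
function isChainable(command) {
  return typeof command.chainableOutput === "function";
}

console.log(isChainable(translate), isChainable(quitFirefox));
console.log(translate.chainableOutput({ object: "hello", goal: "Spanish" }));
```

Keeping the chainable output separate from the preview means the second verb receives the bare result rather than text formatted for display.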

The second question we must address is “when do we establish the noun type of a command’s chainable output?” One unsung but crucial feature of the way Ubiquity works now is that suggestions’ previews are not computed until that suggestion is selected (except for the first suggestion, which in most skins gets previewed immediately). Should we wait for all of the first verbs’ chainable output to be computed and then run them through the noun type detection system? Or should verbs with chainable output also a priori specify what noun types their output will be?

Both of these approaches have their problems. The first (a posteriori) approach computes the chainable output of the first verb, runs noun type detection on it, and then suggests the full combination if it matches what the second verb expects. This has clear performance implications and could greatly complicate our parsing algorithm. The second (a priori) approach avoids these performance costs, but each command will have to list (by name or reference) the noun types that match its output. If a command author is unaware of someone else’s noun type, that chain will be impossible, even if the chainable output itself does match that noun type. The a posteriori approach would never have this issue. What other benefits or problems do you foresee with either of these approaches? Is there another approach which avoids these pitfalls?
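The tradeoff between the two approaches can be illustrated with a small sketch. Everything here is hypothetical: the count words command, its outputNounTypes property, the noun_type_digits noun type (imagined as written by a different author), and both matching helpers are assumptions for illustration, not existing Ubiquity code.

```javascript
// Simplified noun types; score returns 0 for "no match".
const noun_arb_text = { name: "noun_arb_text", score: () => 0.3 };

// Imagine this noun type was written by someone else, and the author of
// "count words" has never heard of it.
const noun_type_digits = {
  name: "noun_type_digits",
  score: (s) => (/^\d+$/.test(s) ? 1 : 0),
};

const countWords = {
  name: "count words",
  // A priori declaration: the author lists only the noun types they know.
  outputNounTypes: [noun_arb_text],
  // The actual output, which may be expensive to compute in real commands.
  chainableOutput: (args) => String(args.object.trim().split(/\s+/).length),
};

// A priori: cheap, but only as good as the author's declaration.
function matchesAPriori(firstVerb, nountype) {
  return firstVerb.outputNounTypes.includes(nountype);
}

// A posteriori: computes the output first, then runs noun type detection.
function matchesAPosteriori(firstVerb, args, nountype) {
  const output = firstVerb.chainableOutput(args);
  return nountype.score(output) > 0;
}

const args = { object: "big fish eat little fish" };
console.log(matchesAPriori(countWords, noun_type_digits));
console.log(matchesAPosteriori(countWords, args, noun_type_digits));
```

In this sketch the a priori check rejects a chain that the a posteriori check accepts, because the output does consist of digits even though the command never declared it. The price is that chainableOutput has to run before the suggestion can be made.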

(A)synchronous composability

Once we have the noun types, parsing, and suggestions down, all that remains is to compute the previews and implement the composite execution. Since the Ubiquity command manager already wraps the preview and execute functions in a wrapper to facilitate localization, among other uses, it would be easy to make the command manager compose asynchronous processes pseudo-synchronously. No major changes should be necessary to do the previews and executions, though, again, there will be a performance cost.
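Here is one way the composition might look, sketched with promises. This is an assumption about how a command manager could chain two commands, not a description of the current wrapper: the executeChain function, the chainableOutput method, and both toy commands are hypothetical, and the timer merely simulates a network lookup.

```javascript
// A lookup command whose result arrives asynchronously.
const translate = {
  name: "translate",
  chainableOutput: (args) =>
    new Promise((resolve) => {
      // Simulated network delay; a real command would call a web service.
      setTimeout(() => resolve("hola"), 10);
    }),
};

// An action command; "sent" records what it was asked to do.
const sent = [];
const email = {
  name: "email",
  execute: async (args) => {
    sent.push(args);
    return `sent "${args.object}" to ${args.goal}`;
  },
};

// The manager waits for the first verb's output and passes it to the second
// verb as its object argument, so the chain reads as if it were synchronous.
async function executeChain(first, firstArgs, second, secondArgs) {
  const output = await first.chainableOutput(firstArgs);
  return second.execute({ ...secondArgs, object: output });
}

executeChain(translate, { object: "hello", goal: "Spanish" }, email, { goal: "Juanito" })
  .then((result) => console.log(result));
```

The second verb does not start until the first verb's output is available, which is where the performance cost comes from.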

Conclusion

There are a number of technical questions which must be answered, mostly in the parsing/suggesting stage. The key questions to answer are:

  1. What is the chaining output of a command?
  2. When do we establish the noun type of a command’s chainable output?

I’ll make another post soon on the linguistic considerations necessary in making command chaining happen in a natural fashion.


  1. We’re going to limit our discussion here to true piping: the two verbs are not simply two simultaneous commands, but two commands which operate successively on an input. This rules out, for example, input such as “google dogs and translate cat to Spanish”, as the second command’s execution does not semantically depend on the first’s. This (hopefully uncontroversial) decision also affects the linguistic considerations to be made in my next post.

  2. If you know of a command which doesn’t neatly fit into “lookup” or “action”, please let me know. 

  3. I believe we should mark these no-execution lookup commands visually so the user does not expect anything to happen if they execute one. This is trac #651.


11 Responses to “Exploring Command Chaining in Ubiquity: Part 1”

  1. Jono Says:

    Hi Mitcho, I think you're basically right with your classification. I think that the most logical way for command authors to implement output values is by returning an object from their execute() method. (And their preview() method?). The return value of the execute() method isn't currently used for anything, and it's a logical place to emit output values.

    I think almost every output value will be either generic text, or a URL. Certainly, all commands that open pages could be pretty trivially modified to return those URLs, perhaps wrapped as some kind of URL object with type info, which can easily be consumed as input by one of the arguments of another command. And commands that either set the text selection, or display text output in a transparent message, or put text into the clipboard, could all be changed to output that text as the return value of execute().

    A thought: Perhaps instead of the command.execute() explicitly calling openURL, or setTextSelection, the command.execute() should simply return a URL object or a text object. If there's a next command in the chain, it will get that value; if there is not a next command on the chain, then the Ubiquity core will catch it and perform the "default action" based on the data type. The default action would be to set the text selection in the case of text, or to open a URL in the case of a URL.

    I have been assuming before that any command which wants to allow chaining would have to specify the nountype of its output. Your alternative (running nountype detection on the output values) is intriguing as it has a lot of flexibility advantages, but it occurs to me that in a lot of cases we may not have any output to run nountype detection on until after the command is executed, at which point it's too late to chain it with anything.

  2. aaron Says:

    I don’t know a lot about the internals of Ubiquity, so something I say may be wrong, but it seems like the main uses for custom noun types are completion and imposing structure on input. Argument completion isn’t an issue for commands that are not the first in a chain. (As you point out, verb suggestion is a problem, and not one I have much of an idea for). So, for imposing structure, what if you let commands specify a function that transforms another input type into their preferred type. Concretely, I imagine this being implemented as an associative array of noun types and input functions. So the command “multiply” might specify a dict as follows (pseudocode): { noun_type_number: function(x) { return x; }, noun_type_text: function (x) { find the first number in x and return it; } }

    This would allow chaining of less than fully compatible commands. Take the following example: “get the number of wugs and then multiply by 2” Imagine “multiply” is expecting a number as input, but “get the number of” returns a string like “There are 5 wugs”. But by calling the noun_type_text member of “multiply“‘s input function dictionary, you can recover the number.

    This would be bad for intelligent verb suggestion, of course, because presumably any command is going to make an effort to do something with arbitrary text (and most command output can be down-converted to text). I guess you’d have to allow commands to specify a “preferredness” value for each noun type they accept. So a command that accepts email addresses would be a good match for a command that outputs email addresses; a mediocre match for a command that outputs text (because presumably it has some way of fishing an email address out of a larger string); and a poor match for a command that outputs numbers (because no number —> email address conversion is defined)

    A further thought: If you impose some API requirements on validation/conversion functions, you can get some extra mileage. First, argument validation functions must not do anything side effectful, but only validate. Second, they must signal some kind of error when validation fails. Then you can run them in the background after the first command is entered. So to return to the above example, if the user types “get the number of wugs and then” and pauses typing, Ubiquity can use the idle time to call commands’ validation functions. The function for validating text input to “multiply” will result in success, because there is a number in “There are 5 wugs.” A function for validating email addresses will fail, though, because there aren’t any email addresses. Then Ubiquity could cull email-address-accepting functions from the list of suggestions.
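The converter dictionary from aaron's pseudocode could be sketched as runnable JavaScript roughly as follows. The inputConverters property, the chainInto helper, and the convention of returning null to signal a failed validation are assumptions added for illustration.

```javascript
// A command with a table of input converters, keyed by the noun type name
// of the incoming value. A converter returns null when validation fails.
const multiply = {
  name: "multiply",
  inputConverters: {
    noun_type_number: (x) => x,
    noun_type_text: (text) => {
      // Find the first number in the text, if there is one.
      const match = /-?\d+(\.\d+)?/.exec(text);
      return match ? Number(match[0]) : null;
    },
  },
  execute: (input, factor) => input * factor,
};

// Converts the previous command's output and, if that succeeds, executes.
// Returns null when no converter exists or the conversion fails, which a
// suggestion engine could use to cull this command from the list.
function chainInto(command, outputType, output, ...rest) {
  const convert = command.inputConverters[outputType];
  if (!convert) return null;
  const value = convert(output);
  return value === null ? null : command.execute(value, ...rest);
}

console.log(chainInto(multiply, "noun_type_text", "There are 5 wugs", 2));
console.log(chainInto(multiply, "noun_type_text", "no wugs here", 2));
```

Because the converters have no side effects, they could be run in the background while the user is still typing, as the comment suggests.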

  3. mitcho Says:

    Aaron, thanks for commenting! Our noun type detection system, together with the way Ubiquity commands specify their arguments, does exactly what you describe: noun suggestions are returned with scores which represent how well that input matches that noun type. This is used both for validating individual arguments and for suggesting verbs. More information is available at http://mitcho.com/blog/projects/judging-noun-type... . An old demonstration of the smart verb suggestion is at http://mitcho.com/blog/projects/a-demonstration-o... and details of our scoring model are at http://mitcho.com/blog/observation/scoring-for-op... .

    I believe what you're describing in your last paragraph is the a posteriori approach (noun type detection after getting the chainable output of the first verb) I described above.

    Nice casual use of "wugs," btw. ;)

  5. Felipe Gomes Says:

    — Attention, wild brainstorming in this comment :)

    Command chaining is something really interesting. Four or five years ago, when the ideas of natural language interfaces were popping up, there was a service called YubNub (http://www.yubnub.org, still exists), and it was really simple but worked really well. It's hard to imagine up front all the new possibilities chaining brings, but when the pieces start tying together it gets interesting. When they implemented both named parameters and command chaining, lots of commands and meta-commands appeared (like ifThenElse) that made it almost a pseudo-scripting language for the web. However, their approach and goals were quite different (natural language was not the final goal), so I don't think that everything applies to Ubiquity, but I just wanted to mention it to bring some inspiration.

    One idea for how the chaining could work is: when the user types a composite command (let's simplify it by being detected as the word "and" directly followed by a noun verb), only the first command would be interpreted, the second part would be left ignored. On the matches list (where you do the nice underlining of noun-types and such) the second command would be displayed grayed out. When the user is certain that the first command previewed is correct, then pressing Enter or the right arrow key would execute the command, get the lookup data, and move to the next command in the chain. Or perhaps it wouldn't execute the command but just let the user adjust the second command and use a placeholder for the output (similar to the "this" placeholder… could be a nice little box saying "output").

    This is all pretty much English-only, I think (though it would work for Portuguese), and I guess the model would break in other languages, so I'm excited to see what you'll say about the international challenges here.

    Another thing is that there are some possible action commands that would be chainable, or at least considering that some action commands would have output data. For example, a file uploader command: "upload picture to flickr and e-mail it to mitcho" :) Of course this is just crazy future imagination

  6. Action! Says:

    twitter g'night and quit firefox

  7. Zeno Davatz Says:

    ;)

  8. mitcho Says:

    Heh. :p

  10. Exploring Command Chaining in Ubiquity: Part 2 Says:

    […] A few days ago I penned some initial technical considerations regarding command chaining. In this post I’ll be pointing out some linguistic considerations involved in supporting a natural syntax for chaining. […]


© 2006-2010 mitcho (Michael 芳貴 Erlewine).
The views expressed on these pages are mine alone and do not
reflect those of my employers and clients, past and present.