mitcho Michael 芳貴 Erlewine

Linguist. Fifth year PhD student at MIT.


Posts Tagged ‘verbs’

Exploring Command Chaining in Ubiquity: Part 1

Wednesday, August 19th, 2009

Since the dawn of time, people have been asking about command chaining in Ubiquity. If you have a translate command and an email command, it would be great to be able to, for example, “translate hello to Spanish and email to Juanito”. This is what we call command chaining or [[Pipeline_(Unix)|piping]]: in a single complex query, specifying multiple (probably two) actions and using the first’s output as the second’s input.[1]
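To make the idea concrete, here is a minimal sketch of piping in Python. The two functions are hypothetical stand-ins for a translate and an email command (the names, signatures, and one-entry dictionary are assumptions for illustration, not Ubiquity’s actual API); the only point is that the second command consumes the first command’s output.

```python
from typing import Callable

# Hypothetical stand-ins for two Ubiquity-style commands; the names, signatures,
# and the tiny dictionary are illustrative only, not Ubiquity's actual API.
def translate(text: str, to: str) -> str:
    toy_dictionary = {("hello", "Spanish"): "hola"}
    # Fall back to the untranslated text when the toy dictionary has no entry.
    return toy_dictionary.get((text, to), text)

def email(body: str, to: str) -> str:
    # Build the message as a string instead of actually sending anything.
    return f"To: {to}\n\n{body}"

def chain(first: Callable[[], str], second: Callable[[str], str]) -> str:
    """Run `first`, then feed its output to `second` as input."""
    return second(first())

message = chain(
    lambda: translate("hello", to="Spanish"),
    lambda output: email(output, to="Juanito"),
)
print(message)
```

Note that `chain` only makes sense when the second command really depends on the first one’s result, which is exactly the restriction to true piping.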

Today I hope to cover some of the technical considerations required in implementing command chaining in Ubiquity, and I will follow up soon with a blog post on the linguistic considerations required as well.


  1. We’re going to limit our discussion here to cases where the two verbs are not simply two simultaneous commands, but two commands which operate successively on an input, i.e., true piping. This rules out, for example, input such as “google dogs and translate cat to Spanish”, as the second command’s execution does not semantically depend on the first’s. This (hopefully uncontroversial) decision also affects the linguistic considerations to be made in my next post.

Ubiquity Commands by The Numbers

Wednesday, April 1st, 2009

Recent work in the Ubiquity internationalization realm has focused on the upcoming Ubiquity parser which will bring some great new features to Ubiquity, including support for overlord verbs and semi-automatic localization of commands via semantic roles. It’s possible, though, that these new features will break backwards compatibility of the current command specification and noun types. [[Creative destruction]] for the win.

As we look to move forward with incorporating the next generation parser into Ubiquity proper, it becomes important to take a look at the current command ecosystem to see how disruptive this move might be. To this end, last night I wrote a quick Perl script to scrape the commands cached on the herd and get some quantitative answers to my questions.


Where’s The Verb?

Wednesday, March 25th, 2009

Ubiquity’s proposed new parser design is based on a [[principles and parameters]] philosophy: we can build an underlying universal parser and, for each individual language, we simply set some “parameters” to tell the parser how to act. As we consider the design’s pros and cons, it’s important to reflect back on the linguistic data and see if this architecture can adequately handle the range of linguistic data attested in our languages.
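As a rough illustration of the “one parser, per-language parameters” idea, here is a small Python sketch. The single parameter `verb_final` and the function `split_verb` are hypothetical names I made up for this example; the real design would need many more parameters than this.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class LanguageParameters:
    # Hypothetical parameter: True if the verb comes at the end of the input.
    verb_final: bool

# Illustrative settings only; a real parser would carry many more parameters.
ENGLISH = LanguageParameters(verb_final=False)
JAPANESE = LanguageParameters(verb_final=True)

def split_verb(words: List[str], params: LanguageParameters) -> Tuple[str, List[str]]:
    """The shared ("universal") step: pick out the verb using the language's parameter."""
    if params.verb_final:
        return words[-1], words[:-1]
    return words[0], words[1:]

print(split_verb(["translate", "hello", "to", "Spanish"], ENGLISH))
print(split_verb(["Mary-o", "miru"], JAPANESE))
```

The function body never mentions a specific language; everything language-specific lives in the parameter object.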

Today I’ll highlight some disparate typological data to help us understand these questions: where’s the verb? and what does the verb look like?

Ubiquity Parser: The Next Generation Demo

Wednesday, March 18th, 2009


A week or two ago while visiting California, Jono and I had a productive charrette, resulting in a new architecture proposal for the Ubiquity parser, as laid out in Ubiquity Parser: The Next Generation. The new architecture is designed to support (1) the use of overlord verbs, (2) writing verbs by semantic roles, and (3) better suggestions for verb-final languages and other argument-first contexts. I’m happy to say that I’ve spent some time putting a proof-of-concept together.

I’ve implemented the basic algorithm of this parser for [[left-branching]] languages (like English) and also implemented some fake English verbs, noun types, and semantic roles. This demo should give you a basic sense of how this parser will attempt to identify different types of arguments and check their noun types even without clearly knowing the verb. This should make the suggestion ranking much smarter, particularly for verb-final contexts. (For a good example, try from Tokyo to San Francisco.)
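For readers who want a feel for the argument-first idea without opening the demo, here is a minimal Python sketch. The role markers and the fake verb table are assumptions invented for this example and are not taken from the demo; the sketch also skips noun type checking entirely and only shows how roles found in the input can narrow down the verbs before any verb is typed.

```python
# Hypothetical sketch of argument-first parsing: the role markers and verb
# table are made up for illustration and are not taken from the demo.
ROLE_MARKERS = {"from": "source", "to": "goal"}

FAKE_VERBS = {
    "fly": {"source", "goal"},
    "translate": {"object", "source", "goal"},
    "email": {"object", "goal"},
}

def parse_arguments(text):
    """Split input into {role: phrase}; unmarked leading words become 'object'."""
    arguments = {}
    role = "object"
    words = []
    for word in text.split():
        if word in ROLE_MARKERS:
            # A marker closes the phrase collected so far and opens a new role.
            if words:
                arguments[role] = " ".join(words)
            role, words = ROLE_MARKERS[word], []
        else:
            words.append(word)
    if words:
        arguments[role] = " ".join(words)
    return arguments

def suggest_verbs(arguments):
    """Keep verbs that accept every role found; fewest unused roles first."""
    roles = set(arguments)
    candidates = [verb for verb, accepted in FAKE_VERBS.items() if roles <= accepted]
    return sorted(candidates, key=lambda verb: len(FAKE_VERBS[verb] - roles))

arguments = parse_arguments("from Tokyo to San Francisco")
print(arguments)
print(suggest_verbs(arguments))
```

With this toy table, the input contains a source and a goal but no verb, so the verb that takes exactly those two roles is ranked first and the verb that takes no source is dropped.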

➔ Check out the Ubiquity next-gen parser demo


Writing commands with semantic roles

Tuesday, February 24th, 2009

Thank you to everyone who contributed data to how your language identifies its arguments! The data collection is ongoing so please contribute data points for languages you know!

How Ubiquity identifies its arguments

Currently, when writing a command in Ubiquity, you must specify two properties for each argument: a modifier (the appropriate [[adposition]], except for the direct object) and the noun type. Here are some quick examples from the standard commands:


  • direct object (noun_arb_text)
  • to (noun_type_contact)


  • direct object (noun_arb_text)
  • to (noun_type_language)
  • from (noun_type_language)

This way of specifying arguments has a few shortcomings. First of all, it requires you to identify each type of argument by a unique adposition, which supports neither languages with [[case marking]] nor languages with sets of synonymous adpositions (e.g. French {à la, au, aux}). Second, as we saw in “how your language identifies its arguments”, some languages don’t mark semantic roles on the arguments at all, and the current system of specifying arguments is completely incompatible with these languages. Third, the current specification requires command authors to make localized versions of their commands, specifying the language-appropriate modifiers.
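One way to picture the alternative is a sketch like the following, in Python. The role names (“object”, “goal”, “source”), the marker tables, and both helper functions are assumptions for illustration, not Ubiquity’s actual specification. The command declares its arguments once by semantic role, and each language separately lists which adpositions can mark each role.

```python
# Hypothetical sketch: role names ("object", "goal", "source") and the marker
# tables below are illustrative assumptions, not Ubiquity's actual specification.

# A command declares its arguments once, by semantic role and noun type.
ARGUMENTS_BY_ROLE = {
    "object": "noun_arb_text",
    "goal": "noun_type_language",
    "source": "noun_type_language",
}

# Each language maps a role to the set of adpositions that can mark it,
# so synonymous adpositions (French "à la", "au", "aux") share one role.
ROLE_MARKERS = {
    "en": {"goal": {"to"}, "source": {"from"}},
    "fr": {"goal": {"à la", "au", "aux"}, "source": {"de"}},
}

def role_for_marker(language, marker):
    """Return the semantic role a marker signals in `language`, or None."""
    for role, markers in ROLE_MARKERS[language].items():
        if marker in markers:
            return role
    return None

def noun_type_for_marker(language, marker):
    """Look up the noun type the command expects for a marked argument."""
    return ARGUMENTS_BY_ROLE.get(role_for_marker(language, marker))

print(role_for_marker("fr", "aux"))
print(noun_type_for_marker("en", "from"))
```

Under this arrangement the command author never writes a language-specific modifier; localizing the command for a new language means adding one entry to the marker table.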


Friendlier command feed subscription

Monday, February 23rd, 2009


If you’ve ever subscribed to a new Ubiquity command before, you know the red screen of doom. Ubiquity currently takes users to this page every time they wish to subscribe to a new command. The current design is meant to encourage users to be aware of the possible security implications of enabling and executing a command, to avoid getting a [[trojan horse]].

The current screen, however, does not make subscribing to commands foolproof. I know I’ve personally subscribed to a number of commands without reading through the code, defeating the purpose of the anti-trojan horse display. Moreover, the page doesn’t give you any information on how you can use this new command. Especially given the inherently limited discoverability of a natural language interface, taking a moment to help the user actually learn the command becomes key.

Today I did a quick mockup of what a friendlier command feed subscription page might look like. Take a look at this screenshot with some of the features marked:


You can also check out the page itself. If you’d like to visualize it without the “trust” warning, you can also view the trusted version.

This mockup here is but a first iteration. What do you think about this subscription page? What is missing? What should be changed?

Ubiquity in Firefox: Focus on Japanese

Friday, February 20th, 2009

One of the eventual goals of the Ubiquity project is to bring some of its functionality and ideas to Firefox proper. To this end, Aza has been exploring some possible options for what that would look like (round 1, round 2). All of his mockups, however, use English examples. I’m going to start exploring what Ubiquity in Firefox might look like in different kinds of languages. Let’s kick this off with my mother tongue, Japanese.[1]

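I’ll be looking at what Ubiquity in Firefox might look like across a variety of languages, and today I’m taking up Japanese in particular. I plan to post the same content in Japanese at a later date. ^^ Comments in Japanese are also very welcome!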


Three ways to argue over arguments

Wednesday, February 18th, 2009

UPDATE: Contribute information on how your language identifies its arguments here.

When we execute a command in Ubiquity, in very simple terms, we’re hoping to do something (a verb) to some arguments (the nouns). Every sentence in every language uses some method to encode which arguments correspond to which roles of the verb. Here are a couple of examples:

  1. He sees Mary.
  2. 彼が Maryを 見る。 (Kare-ga Mary-o miru.)

As speakers of English, you can read sentence (1) above and know exactly who is doing the seeing and who is being seen, and speakers of Japanese can get the same information from (2). How do different languages code for arguments in different roles? There are, broadly speaking, three different ways:

[Figure: three ways to code for arguments in different roles]

We’ll take a brief look today at these three different strategies, all of which a localizable natural language interface will surely encounter.