Writing commands with semantic roles
Thank you to everyone who contributed data to how your language identifies its arguments! The data collection is ongoing so please contribute data points for languages you know!
How Ubiquity identifies its arguments
Currently when writing a command in Ubiquity you must specify two properties for each argument: a modifier (the appropriate adposition—the direct object excluded) and the noun type. Here are some quick examples from the standard commands:
email:
- direct object (noun_arb_text), to (noun_type_contact)
translate:
- direct object (noun_arb_text), to (noun_type_language), from (noun_type_language)
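To make the coupling concrete, here is a minimal sketch of how these two specifications might look as plain data. It is written in Python purely for illustration; the dictionary layout and the `modifiers_of` helper are assumptions of this sketch, not Ubiquity's actual command API.

```python
# Hypothetical data layout mirroring the two examples above.
# This is NOT Ubiquity's real API, only an illustration of the idea.
CURRENT_STYLE_COMMANDS = {
    "email": {
        "direct_object": "noun_arb_text",
        "modifiers": {"to": "noun_type_contact"},
    },
    "translate": {
        "direct_object": "noun_arb_text",
        "modifiers": {
            "to": "noun_type_language",
            "from": "noun_type_language",
        },
    },
}


def modifiers_of(command_name):
    """Return the English adpositions that a command's arguments are keyed on."""
    return sorted(CURRENT_STYLE_COMMANDS[command_name]["modifiers"])
```

Notice that the keys of `modifiers` are literal English words, so the specification itself is tied to one language.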
This way of specifying arguments has a few shortcomings. First of all, it requires you to identify each type of argument by a unique adposition, which supports neither languages with case marking nor languages with sets of synonymous adpositions (e.g. French {à la, au, aux}). Second, as we saw in “how your language identifies its arguments,” some languages don’t mark semantic roles on the arguments at all, and the current system of specifying arguments is completely incompatible with these languages. Third, the current specification requires command authors to make localized versions of their commands, specifying the language-appropriate modifiers.
In a perfect world the last issue could be solved (at least for languages which mark semantic roles with adpositions) by a mapping of English prepositions to the target language adpositions. Indeed, for some adpositions in some languages this may be possible:
| English/Ubiquity | | Chinese | Japanese |
|---|---|---|---|
| to | => | 到 (dào) | -に (-ni) |
| from | => | 从 (cóng) | -から (-kara) |
However, some English prepositions do not map cleanly to a particular adposition. Take, for example, English “with.” This “with” may map to different markings in Chinese and Japanese depending on the sentence:
| English | | Chinese | Japanese |
|---|---|---|---|
| share with Jono | => | 跟 (gēn) | -と (-to) |
| translate with Google | => | 用 (yòng) | -で (-de) |
Note, however, that which set of markings “with” maps to is predictable, as there is a salient semantic difference. The first “with” could be referred to as together-with while the second is a using-with. With this distinction, we can easily predict which paradigm the “with” in “search with Google” should use, because these two “with” arguments represent two different semantic roles.
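As a rough sketch of the difference (in Python, for illustration only), compare a naive word-for-word table with a table keyed on semantic role. The role labels `together_with` and `using_with` and the `marker_for` helper are hypothetical names for this example.

```python
# Naive approach: each English preposition gets exactly one translation,
# so "with" can only ever be one of its two senses.
NAIVE_ZH = {"to": "到", "from": "从", "with": "跟"}

# Role-keyed approach: markers are looked up by semantic role,
# not by the English word. Role names are hypothetical labels.
ROLE_MARKERS = {
    "together_with": {"zh": "跟", "ja": "と"},
    "using_with": {"zh": "用", "ja": "で"},
}


def marker_for(role, lang):
    """Return the marker a language uses for the given semantic role."""
    return ROLE_MARKERS[role][lang]
```

With the naive table, “translate with Google” would come out with 跟; looking up `using_with` instead gives 用 for Chinese and で for Japanese.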
A proposal: identifying arguments by semantic role [1]
Suppose commands could specify their arguments by referring to these semantic roles in lieu of adpositions as they currently do. This way, we would be able to automatically map commands into different languages. For example, you could write a new command called move with the following argument structure:
move:
role_object (noun_arb_text), role_goal (noun_type_geolocation), role_source (noun_type_geolocation)
The English mapping of ‘’ (no marker) => role_object, ‘to’ => role_goal, ‘from’ => role_source could be used to parse the command:
move truck from Tokyo to Paris
In addition, with the Japanese mapping of ‘を’ => role_object, ‘に’ => role_goal, ‘から’ => role_source, you could immediately use the command in Japanese as well:
東京からパリにトラックをmoveして
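Here is a toy sketch (in Python, not Ubiquity code) of how two tiny language-specific parsers could reduce both sentences to the same role assignments. It assumes whitespace-separated English with prepositions before their arguments, and Japanese with particles after their arguments, using を for the object as in the example sentence. The function names and the particle-splitting regular expression are assumptions of this sketch; a real parser would need proper segmentation.

```python
import re

# Marker-to-role tables; an unmarked English argument is treated as the object.
EN_MARKERS = {"to": "role_goal", "from": "role_source"}
JA_MARKERS = {"を": "role_object", "に": "role_goal", "から": "role_source"}


def parse_english(text):
    """Verb first; each preposition opens the role for the words that follow it."""
    verb, *words = text.split()
    args, role = {}, "role_object"
    for word in words:
        if word in EN_MARKERS:
            role = EN_MARKERS[word]
        else:
            args[role] = (args.get(role, "") + " " + word).strip()
    return verb, args


def parse_japanese(text):
    """Each particle closes the argument before it; the verb is what remains."""
    # Try longer particles first so a two-character particle wins.
    particles = "|".join(sorted(JA_MARKERS, key=len, reverse=True))
    args, end = {}, 0
    for match in re.finditer(rf"(.+?)({particles})", text):
        args[JA_MARKERS[match.group(2)]] = match.group(1)
        end = match.end()
    verb = text[end:]
    if verb.endswith("して"):  # strip the light verb attached to "move"
        verb = verb[: -len("して")]
    return verb, args
```

Both `parse_english("move truck from Tokyo to Paris")` and `parse_japanese("東京からパリにトラックをmoveして")` yield the verb `move` with an object, a source, and a goal, so the command itself never needs to know which language it was invoked in.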
In essence, this proposal would let command authors get their commands localized for free, as long as they stick to a predefined set of semantic roles. For more complex commands and legacy commands, of course, commands could optionally specify particular English modifiers, but then Ubiquity would simply not attempt to localize those commands.
In addition, each language-specific parser would determine how to identify its arguments. This would allow languages with case marking, or with no role marking on arguments at all, to handle their own mapping of arguments to semantic roles and still use shared commands. Even the English parser would benefit, since it could take care of synonymous prepositions and possibly even argument structure alternations (such as the English ditransitive alternation) itself.
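For instance, a language-specific parser could keep a many-to-one table from surface markers to roles. The sketch below (Python, illustrative only) uses the French set mentioned earlier; assigning these markers to `role_goal` and the `split_marker` helper are assumptions made for the example.

```python
# Several surface forms map to one role, so a command never sees the variation.
# Treating all three as goal markers is an assumption of this sketch.
FR_MARKERS = {"à la": "role_goal", "au": "role_goal", "aux": "role_goal"}


def split_marker(phrase, markers):
    """Return (role, rest) if the phrase starts with a known marker,
    otherwise (None, phrase)."""
    # Check longer markers first so "aux" is not mistaken for "au".
    for marker in sorted(markers, key=len, reverse=True):
        if phrase == marker or phrase.startswith(marker + " "):
            return markers[marker], phrase[len(marker):].strip()
    return None, phrase
```

Because the table lives in the parser, adding another synonym later would not require touching any command.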
As a starting point, we could use argument types based on the list of semantic roles given in Fillmore (1971):
- Object: the entity that moves or changes or whose position or existence is in consideration
- Result: the entity that comes into existence as a result of the action
- Instrument: the stimulus or immediate physical cause of an event
- Source: the place from which something moves
- Goal: the place to which something moves
- Experiencer: the entity which receives or accepts or experiences or undergoes the effect of an action …
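One way to picture a “predefined set of semantic roles” is as a fixed enumeration that command specifications draw from. This Python sketch includes only the six roles spelled out above (the list in this post is abbreviated), and `is_auto_localizable` is a hypothetical helper, not part of Ubiquity.

```python
from enum import Enum


class SemanticRole(Enum):
    # Only the roles listed above; the list in the post is abbreviated.
    OBJECT = "object"
    RESULT = "result"
    INSTRUMENT = "instrument"
    SOURCE = "source"
    GOAL = "goal"
    EXPERIENCER = "experiencer"


def is_auto_localizable(argument_spec):
    """True if every argument is keyed on a predefined role
    rather than on a literal English modifier."""
    return all(isinstance(key, SemanticRole) for key in argument_spec)


# A role-based spec versus a legacy spec keyed on an English word.
move_spec = {
    SemanticRole.OBJECT: "noun_arb_text",
    SemanticRole.GOAL: "noun_type_geolocation",
    SemanticRole.SOURCE: "noun_type_geolocation",
}
legacy_spec = {"with": "noun_arb_text"}
```

Under this sketch, `move_spec` would be eligible for automatic localization, while `legacy_spec` would be left as an English-only command.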
Comments welcome!
As command authors and Ubiquity users, how do you feel about this proposal? How might this affect, simplify, or complicate the localization of Ubiquity into your language? Thank you in advance! ^^
[1] Thank you to Jono and Blair, whose comments in our i18n meeting helped shape this proposal.
February 24th, 2009 at 10:49 am
The localization seems to put a strong emphasis on the verb defining the semantic roles of its arguments, that is, defining the noun type for each of the (six) roles according to Fillmore.
Identifying arguments by semantic role augments the sentence context, helping to resolve the hyphen problem in the verbs. For instance, I have a command, "yahoo-graph (quote)", which gets the year-to-date price chart of a stock. It could be replaced with the following new syntax: "chart IBM from yahoo". In my example "IBM" would be a quote, and there would be no overwriting by another command using "chart" as its verb if it uses another argument type.
It seems you have been moving toward an NLP approach. Does it make sense to build a dictionary explaining the semantic roles of the arguments for each verb?
February 24th, 2009 at 2:33 pm
As for the "chart IBM from yahoo," this sounds a lot like Jono's overlord verbs proposal. Have you looked at that?
As for the dictionary… can you elaborate what you mean by that?
February 24th, 2009 at 3:24 pm
Yes, I read Jono's post and your proposal works nicely with overlord verbs (of course). I gave an example of one of my commands to show how overlord verbs and semantic roles affect a command.
From the user's point of view, Ubiquity manages discoverability and conflicts (or choices) with suggestions… but, from the command author's point of view, there are the open issues from Jono's post (naming standard, name mangling, etc.) and the semantic roles for each verb.
My proposal is a dictionary, a reference book, containing the information (semantic roles) needed to correctly create a new (overlord) verb.
February 25th, 2009 at 2:04 am
Thanks Alberto. I see what you mean by the dictionary… I'm sure if/when these semantic roles and the overlord system are put into place, we'll be rewriting/refining the command authoring manual.
June 4th, 2009 at 1:09 pm
I'd prefer a dynamic dictionary that is built by checking which overlord words the installed commands have declared that they use.
Check my comments here: http://groups.google.com/group/ubiquity-firefox/b…
January 11th, 2011 at 6:29 pm
I don't know very much about languages, but this seems like a very reasonable solution to the problem at hand. It's not confusing for developers and it would immediately give us the ability to have synonyms for the different roles, which would make Ubiquity much more user friendly.
This would, however, mean that a lot more processing would go into the different parsers… but since it saves us command writers more time and gives us great benefits (and we don't have to write it), I'm all for it =P