Where’s The Verb?
Ubiquity’s proposed new parser design is based on a principles-and-parameters philosophy: we build one underlying universal parser and, for each individual language, simply set some “parameters” that tell the parser how to act. As we weigh the design’s pros and cons, it’s important to go back to the linguistic data and check whether this architecture can adequately handle the range of patterns attested in our languages.
Today I’ll highlight some disparate typological data to help us answer two questions: where’s the verb, and what does the verb look like? Broadly, commands take one of three verb forms across languages:[1]
- the infinitive,
- subjunctive mood, or
- a special verb form such as the imperative, participial, or conjunctive (such as the Japanese て form)
Let’s give an example of each:
Infinitive (English):[2]
Hit me!
Subjunctive mood (Modern Greek): “Eat it all!”
Na   to fas olo!
SUBJ it eat all
Imperative form (French): “Eat it!”
Mange-le!
eat.IMP it
It’s important to note that some languages have multiple forms available for the same command. For example:
Dutch: three ways to say “watch out!” with the same verb
- Infinitive: Oppassen!
- Imperative: Pas op!
- Participial: Opgepast!
Similarly, on yesterday’s blog post I received a great comment from PhiliKON on German, with associated data from Robert Kaiser:
German: “search hello with google”
- Infinitive: hello mit google suchen
- Imperative: suche hello mit google
In addition, German and Dutch are interesting as they are verb second (V2) languages, so the verb may surface at the beginning or the end of the sentence, depending on the form.
The new parser design (which you can demo) assumes for simplicity that the verb should be found at the beginning or the end of the input, which is consistent with the data I’ve seen (modulo clitics). Multiple verb forms could be accounted for by supporting “synonyms” of the verbs.
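To make that assumption concrete, here is a minimal Python sketch of the idea, not the actual Ubiquity parser code. The names `LanguageParams`, `verb_forms`, and `find_verb` are hypothetical, invented for this illustration. Each language supplies a table mapping its surface verb forms ("synonyms") to one command, and the parser only checks the first and last words of the input. It handles single-word verb forms only, so clitics and separable particles (as in Dutch "Pas op!") would need extra handling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class LanguageParams:
    """Hypothetical per-language parameters (an assumption, not Ubiquity's API)."""
    # Maps each surface verb form ("synonym") to a canonical command name.
    verb_forms: Dict[str, str] = field(default_factory=dict)


def find_verb(text: str, params: LanguageParams) -> Optional[Tuple[str, List[str]]]:
    """Look for a known verb form at the beginning or the end of the input.

    Returns (command, remaining_words), or None if neither edge is a known form.
    """
    words = text.split()
    if not words:
        return None
    # Check the first word, then the last word; nothing in the middle is tried.
    if words[0].lower() in params.verb_forms:
        return params.verb_forms[words[0].lower()], words[1:]
    if words[-1].lower() in params.verb_forms:
        return params.verb_forms[words[-1].lower()], words[:-1]
    return None


# German: the imperative and the infinitive are registered as forms of one command.
GERMAN = LanguageParams(verb_forms={"suche": "search", "suchen": "search"})

print(find_verb("suche hello mit google", GERMAN))   # ('search', ['hello', 'mit', 'google'])
print(find_verb("hello mit google suchen", GERMAN))  # ('search', ['hello', 'mit', 'google'])
```

Both German examples above resolve to the same command with the same remaining words, whether the verb is sentence-initial (imperative) or sentence-final (infinitive).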
What are the different ways verbs are expressed in commands in your language? Is the verb always found at the beginning or the end of the sentence? Is it ever somewhere in the middle?
[1] Some of the data and theoretical support for this section comes from, among other sources, Sabine Iatridou’s De Modo Imperativo lecture notes.
[2] Many refer to this in English as an “imperative form,” but in Modern English this is arguably the same as the infinitive.
Tags: commands, infinitive, linguistics, Mozilla Planet, parser, subjunctive, typology, ubiquity, verb-final, verbs
March 25th, 2009 at 8:27 am
Russian has both the infinitive and the imperative form though the latter would probably be more natural for Ubiquity. The verb comes first in both cases:
Infinitive: Искать адрес на Гугле
Imperative: Ищи адрес на Гугле
March 25th, 2009 at 9:34 am
Again, my question is: how are you going to deal with agglutinative or highly inflecting languages, where morphological analysis seems unavoidable? English has a very reduced grammar, so you can figure most things out from the word order. But e.g. in Hungarian, the word order gives little information.
March 25th, 2009 at 10:48 am
Just a small correction on your example of an imperative form : "do it" in French would be "fais-le". Very interesting series of posts, BTW.
March 25th, 2009 at 11:26 am
Your French example for "Do it!" should be "Fais-le !" (informal) or "Faites-le !" (formal).
When talking to a computer though, I think people would rather type "Faire xxx" (infinitive) than use an imperative. The infinitive would actually imply "I want you to …".
I'll try to complete your survey for French but I feel it's going to be complicated
March 25th, 2009 at 12:22 pm
Olivier, Benoit, sorry about that French example. It was wrong but also a bad example… so I changed it to "mange-le!" This is a better example as you actually see that there's a separate imperative form, contrasting with the regular second person singular conjugation ("manges"). Thanks for the correction!
@Benoit - your point about the infinitive is duly noted… we definitely want to be able to support these other "natural" options as well. I'd love to see you fill out the survey!
March 25th, 2009 at 12:24 pm
@Szabolcs, thanks for the comment again. It's a great question and an important issue, and I hope to dedicate a blog post to agglutinative case marking in the near future. If you want me to take a look at Hungarian in particular, I'd appreciate it if you (or another speaker) could reply to the survey posted a couple days ago.
March 25th, 2009 at 12:40 pm
In Italian we use the imperative form (ex. cerca [search it], traduci [translate it]) and the verb always comes first (ex. cerca-lo su google [search it in google])
March 25th, 2009 at 3:56 pm
@mitcho, I'll reply to the survey tomorrow. Until then, here are a few examples to illustrate what I mean:
translate this from English to French
fordítsd ezt angolról franciára
translate this from French to English
fordítsd ezt franciáról angolra
fordítsd = translate, 2nd person sg objective imperative (the presence of the object is marked on the verb)
ezt = this, objective
francia = French
franciáról = from French
franciára = to French
Note how the ending of "francia" changes, also influencing the last vowel of the word.
Another example:
send mail to Kata
küldj levelet Katának
Note that the name Kata took a suffix, and the last vowel was changed: "Katának".
If we have a non-Hungarian name, e.g. Pete, then it would be
küldj levelet Pete-nek
küldj levelet Petenek
Note that the final letter did not change here (in fact even though we write that E, no final vowel is pronounced). The proper way to attach suffixes to foreign names like this is to use a hyphen, "Pete-nek", but I'm not sure that people would be willing to do that consistently when typing quickly.
Now let's look at sentences like this one:
translate HELLO to French
These are a bit more complicated, and I am not yet sure how to translate them. The problem is that the object must get a suffix, but since in this case the object can be a longer piece of text, we cannot attach a suffix, so we have to use a more complicated construction …
fordítsd franciára azt, hogy HELLO
fordítsd azt franciára, hogy HELLO
Other word orders are possible too, but I think that people are more likely to use one of these two in this particular context (word order can be used to stress a part of the sentence). Anyway, this looks too complicated, so the following might work better:
fordítsd franciára: HELLO
Literally,
translate (it/this) to french: HELLO
(I put "it/this" in parens there to show that the number and person of both the subject and the object are marked on the verb. But this isn't really relevant as the same verb is unlikely to be used both with and without an object in Ubiquity)
Another complication might be the use of verb-particles (I'm not sure what these are called in English). Another (very common) form of the verb "translate", "fordít", is "lefordít", with the particle "le". In this particular case the meaning of the two words is pretty much the same, but in most cases the particle is essential and cannot be omitted. The problem is that the particle will get detached in imperative:
translate this to French
fordítsd ezt le franciára
fordítsd le ezt franciára
Both word orders are equally likely for this context.
Mitcho, do you think that these problems can be solved?
March 25th, 2009 at 7:35 pm
It is arguable that the infinitive form in English is the form with 'to', while the imperative is the form without it. But, for the purposes of Ubiquity, that distinction probably doesn't matter.