
Posts Tagged ‘French’

Solving Another Romantic Problem: Weak Pronouns

Tuesday, May 12th, 2009

Yesterday I blogged on how to deal with portmanteau’ed prepositions in Ubiquity Parser 2, a common problem in various Romance languages. Today I’ll propose an approach to another romantic problem.

The problem:

Weak pronouns in Romance languages (as well as some other languages) have a special property: they cliticize to the verb, moving from their regular argument position to a position next to the verb. For example, in French, we have an imperative like (1), glossed in (2):

(1) Envoyez  la  lettre à  Pierre!
(2) send.IMP the letter to Pierre

If we replace la lettre or à Pierre with a pronoun (la, “it”, or lui, “to him”, respectively), those weak pronouns move next to the verb; in particular, (5) exemplifies the change in word order. Replacing both arguments with pronouns creates the stacked clitic form of (7).1

(3) Envoyez-la à  Pierre!
(4) send   -it to Pierre
(5) Envoyez-lui la  lettre!
(6) send   -him the letter
(7) Envoyez-la-lui!
(8) send   -it-him

The fact that these weak pronouns are attached to the verb and lack separate delimiters means that we will need a separate mechanism to parse these arguments; indeed, this functionality has been planned in Ubiquity Parser 2 as “step 3”. Here I’ll examine some data and discuss a strategy for parsing weak pronouns.
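To make the idea concrete, here is a minimal JavaScript sketch of peeling hyphen-attached weak pronouns off a French imperative before the remaining arguments are parsed. This is only an illustration under my own assumptions, not the actual “step 3” of Ubiquity Parser 2: the names CLITICS and splitClitics and the role labels “object” and “goal” are hypothetical.

```javascript
// Hypothetical table of French weak pronouns and the argument role each one
// stands for. The role labels are illustrative, not Ubiquity Parser 2 names.
const CLITICS = new Map([
  ["le", "object"],
  ["la", "object"],
  ["les", "object"],
  ["lui", "goal"],
  ["leur", "goal"],
]);

// Split the first word of the input into a verb plus any attached clitics.
// Returns { verb, clitics: [{ text, role }], rest }.
function splitClitics(input) {
  const words = input.trim().replace(/!$/, "").split(/\s+/);
  const first = words[0];
  const rest = words.slice(1).join(" ");
  const pieces = first.split("-");
  const clitics = [];
  for (const piece of pieces.slice(1)) {
    const role = CLITICS.get(piece.toLowerCase());
    if (role === undefined) {
      // Unknown suffix: give up and treat the whole first word as the verb.
      return { verb: first, clitics: [], rest };
    }
    clitics.push({ text: piece, role });
  }
  return { verb: pieces[0], clitics, rest };
}
```

Under these assumptions, splitClitics("Envoyez-lui la lettre!") yields the verb “Envoyez”, one clitic “lui” tagged as goal, and “la lettre” as the remaining text to be parsed as usual. The sketch does not check clitic order, so it would also accept ungrammatical stackings.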

(more…)


  1. Note that the reverse order, “Envoyez-lui-la”, is ungrammatical. Fortunately, we most likely will not have to deal with multiple clitics; see footnote two below. 

Solving a Romantic Problem: Portmanteau’ed Prepositions

Monday, May 11th, 2009

The problem:

In many Romance languages, prepositions and articles often form portmanteau morphs, combining to form a single word.1 Some examples include (French) à + le > au, de + le > du, (Catalan) a + el > al, de + els > dels, per + el > pel. Italian has a particularly productive system of portmanteau’ed prepositions and articles… I refer you to the contraction article on Wikipedia.

As I noted a couple weeks ago, however, some combinations do not form portmanteaus.2
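As a rough picture of what undoing these portmanteaus before parsing could look like, here is a small JavaScript sketch that expands the two French forms quoted above back into preposition plus article. It is an assumption-laden illustration, not necessarily how Ubiquity Parser 2 handles the problem: CONTRACTIONS and expandContractions are hypothetical names, and the table covers only au and du.

```javascript
// Hypothetical lookup table: contracted form -> [preposition, article].
// Only the two French examples from the text are listed here.
const CONTRACTIONS = new Map([
  ["au", ["à", "le"]],
  ["du", ["de", "le"]],
]);

// Replace each known contraction with its preposition + article pair,
// leaving every other word (including non-contracting pairs) untouched.
function expandContractions(input) {
  return input
    .split(/\s+/)
    .filter(Boolean)
    .flatMap((word) => CONTRACTIONS.get(word.toLowerCase()) || [word])
    .join(" ");
}
```

With this table, “au directeur” becomes “à le directeur”, so the preposition à can serve as an ordinary delimiter, while a combination that does not contract, such as “à la maison”, passes through unchanged.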

(more…)


  1. Thanks to Jeremy O’Brien for helping me figure out how to refer to this phenomenon. 

  2. This also relates to the issue of parsing multi-word delimiters, though the argument normalization strategy covered here should reduce the necessity of multi-word delimiters. 

Adding Your Language to Ubiquity Parser 2

Wednesday, April 29th, 2009

NOTE: This blog post has now been added to the Ubiquity wiki and is updated there. Please disregard this article and instead follow these instructions.

You’ve seen the video. You speak another language. And you’re wondering, “how hard is it to add my language to Ubiquity with Parser 2?” The answer: not that hard. With a little bit of JavaScript and some knowledge of and interest in your own language, you’ll be able to get at least rudimentary Ubiquity functionality in your language. Follow along in this step-by-step guide and please submit your (even incomplete) language files!

As Ubiquity Parser 2 evolves, there is a chance that this specification will change in the future. Keep abreast of such changes on the Ubiquity Planet and/or this blog (RSS).

(more…)

Foxkeh demos Ubiquity Parser: The Next Generation

Wednesday, April 1st, 2009

I just made a screencast with Foxkeh to show off the next-generation Ubiquity parser demo and to demonstrate how easy it is to add your own language. Foxkeh wants you to localize the parser into your language. How could you say no? ^^


Foxkeh demos Ubiquity Parser: The Next Generation from mitcho on Vimeo.

There are some details which are not covered in this introductory video, such as how to deal with case marking languages or languages without spaces. Hopefully this’ll inspire some people to play with the demo, though. I’d love to hear your comments! ^^

This week on Ubiquity Parser: The Next Generation

Friday, March 27th, 2009


Last week I released a proof-of-concept demo of the next generation Ubiquity parser design and it was also the focus of discussion in our weekly internationalization meeting.1 Christian Sonne even wrote a Danish plugin for it during the meeting, a testament to the pluggability of the new parser design.

In addition, at the Ubiquity weekly meeting, pushing this new parser into Ubiquity proper was identified as a key goal of Ubiquity 0.2, making frequent iteration and debate over this parser essential.

To that end, I’ll highlight some of the changes made to the parser demo codebase in the past week: (more…)


  1. The weekly internationalization meeting, like all Ubiquity weekly meetings, is completely open to the public. We’d love to hear new voices contribute to the discussion! Take a look at the schedule of upcoming meetings.


© 2006-2010 mitcho (Michael 芳貴 Erlewine).
The views expressed on these pages are mine alone and do not
reflect those of my employers and clients, past and present.