Ubiquity in Italian!
Thanks to the great work of Sandro Della Giustina, we now have a preliminary Italian parser for use with Ubiquity Parser 2. Sandro brought up a good point, however, about Italian prepositions which contract with the article and the head noun. For example,
traduci dall'inglese al cinese
translate from=the=English to=the Chinese
One current solution is to add a zero-width space after each of these contracted forms, all' and dall'. [1] The appropriate way to add this to the parser is by defining a custom wordBreaker() method.
it._patternCache.contractionMatcher = new RegExp('(^| )(all\'|dall\')','g');
it.wordBreaker = function(input) {
  return input.replace(this._patternCache.contractionMatcher,'$1$2\u200b');
};
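To see what this does to the example sentence, here is a minimal standalone sketch that can be run outside of Ubiquity. The bare `it` object is a hypothetical stand-in for the real Italian parser object, and the final split on spaces and zero-width spaces is only my assumption about how a downstream tokenizer might treat the result, not a description of the parser's actual internals.

```javascript
// Hypothetical stand-in for the Italian parser object. In Ubiquity the real
// object is set up by the parser framework, not built by hand like this.
const it = { _patternCache: {} };

// Match all' or dall' at the start of the input or right after a space.
it._patternCache.contractionMatcher = new RegExp("(^| )(all'|dall')", 'g');

// Insert a zero-width space (U+200B) after each matched contraction.
it.wordBreaker = function (input) {
  return input.replace(this._patternCache.contractionMatcher, '$1$2\u200b');
};

const broken = it.wordBreaker("traduci dall'inglese al cinese");

// Make the invisible character visible for inspection.
console.log(broken.replace(/\u200b/g, '<ZWSP>'));
// prints: traduci dall'<ZWSP>inglese al cinese

// Assumption: a tokenizer that treats U+200B as a word boundary.
console.log(broken.split(/[ \u200b]/));
// prints: [ 'traduci', "dall'", 'inglese', 'al', 'cinese' ]
```

Note that the pattern only matches the straight apostrophe ('). Input typed with a typographic apostrophe (’) would pass through unchanged in this sketch.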
Grazie Sandro!
[1] As John Daggett pointed out to me, in the future we may have to add an intermediate shallow parse instead of modifying the input by adding characters to it (in this case, the zero-width space).
Tags: code, Italian, Mozilla Planet, parser, ubiquity
May 20th, 2009 at 12:30 am
Thank you, Michael, for your support and your hard work on Ubiquity localization.
I have only modified the ca.js file; I hope to improve the plugin in the future.
I am trying to understand how the parser works by reading your documentation and the code, and I hope to improve the plugin soon. Ciao, Sandro