
Ubiquity in Italian!

Thanks to the great work of Sandro Della Giustina, we now have a preliminary Italian parser for use with Ubiquity Parser 2. Sandro brought up a good point, however, about Italian prepositions which contract with the article and the head noun. For example,

traduci   dall'inglese     al     cinese
translate from=the=English to=the Chinese

One current solution is to insert a zero-width space after these contracted forms, all’ and dall’, so that the parser can treat the contraction as a word separate from the noun that follows. [1] The appropriate way to add this to the parser is to define a custom wordBreaker() method.

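// Match all' or dall' at the start of the input or right after a space.
it._patternCache.contractionMatcher = new RegExp('(^| )(all\'|dall\')','g');
// Insert a zero-width space (U+200B) right after each matched contraction.
it.wordBreaker = function(input) {
  return input.replace(this._patternCache.contractionMatcher,'$1$2\u200b');
};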
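
To see what this does to the example above, here is a small standalone sketch. The bare `it` object and the final split on spaces and zero-width spaces are assumptions made only so the snippet runs on its own; in Ubiquity the parser object and the word splitting are provided by the parser itself.

```javascript
// Hypothetical stand-in for the Italian parser object (an assumption;
// the real object is created by Ubiquity Parser 2).
const it = { _patternCache: {} };

// Match all' or dall' at the start of the input or right after a space.
it._patternCache.contractionMatcher = new RegExp("(^| )(all'|dall')", 'g');

// Insert a zero-width space (U+200B) right after each matched contraction.
it.wordBreaker = function (input) {
  return input.replace(this._patternCache.contractionMatcher, '$1$2\u200b');
};

const broken = it.wordBreaker("traduci dall'inglese al cinese");

// Assumption for illustration: words are separated on ordinary spaces
// and on zero-width spaces.
const words = broken.split(/[ \u200b]/);
console.log(words); // five words: traduci, dall', inglese, al, cinese
```

Note that the pattern as written only matches the lowercase forms with a straight apostrophe ('), so input such as All'inizio or dall’inglese (with a typographic apostrophe) passes through unchanged.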

Grazie Sandro!


  1. As John Daggett pointed out to me, in the future we may have to add an intermediate shallow parse instead of adding characters (in this case, the zero-width space) to the modified input. 


One Response to “Ubiquity in Italian!”

  1. gialloporpora Says:

    Thank you Michael for your support and your hard work on Ubiquity localization.
    I have only modified the ca.js file; I hope to improve the plugin in the future :-)

    I am trying to understand how the parser works by reading your documentation and the code, and I hope to improve the plugin soon. Ciao, Sandro


© 2006-2010 mitcho (Michael 芳貴 Erlewine).
The views expressed on these pages are mine alone and do not
reflect those of my employers and clients, past and present.