Ubiquity in Italian!

Thanks to the great work of Sandro Della Giustina, we now have a preliminary Italian parser for use with Ubiquity Parser 2. Sandro brought up a good point, however, about Italian prepositions which contract with the article and the head noun. For example,

traduci   dall'inglese     al     cinese
translate from=the=English to=the Chinese

One current solution is to insert a zero-width space after each of these contracted articles, all’ and dall’, so that the parser sees a word boundary there.[1] The appropriate way to do this in the parser is to define a custom wordBreaker() method.

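// Match the contracted articles all' and dall' at the start of the input or after a space.
it._patternCache.contractionMatcher = new RegExp('(^| )(all\'|dall\')','g');
// Insert a zero-width space (U+200B) after each contraction so it becomes its own word.
it.wordBreaker = function(input) {
  return input.replace(this._patternCache.contractionMatcher,'$1$2\u200b');
};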
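
To see what the word breaker does to the example sentence, here is a minimal standalone sketch. The bare `it` object is a hypothetical stand-in for the real Italian parser object, and the final split on spaces and zero-width spaces is an assumption made for illustration only; Parser 2's own tokenization may differ.

```javascript
// Hypothetical stand-in for the Italian parser object; in Ubiquity it is
// provided by Parser 2 itself.
const it = { _patternCache: {} };

// Same matcher and wordBreaker as in the post.
it._patternCache.contractionMatcher = new RegExp("(^| )(all'|dall')", 'g');
it.wordBreaker = function (input) {
  return input.replace(this._patternCache.contractionMatcher, '$1$2\u200b');
};

const broken = it.wordBreaker("traduci dall'inglese al cinese");

// Assumption for illustration: treat both the plain space and the
// zero-width space (U+200B) as word boundaries.
const words = broken.split(/[ \u200b]/);

console.log(words);
// → [ 'traduci', "dall'", 'inglese', 'al', 'cinese' ]
```

Note that the matcher uses the straight apostrophe ('), so input typed with a typographic apostrophe (’) would pass through unchanged.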

Grazie Sandro!


  1. As John Daggett pointed out to me, in the future we may have to add an intermediate shallow parse instead of adding characters (in this case, the zero-width space) to the modified input. 

One Response to “Ubiquity in Italian!”

  1. gialloporpora Says:

    Thank you Michael for your support and your hard work on Ubiquity localization.
    I have only modified the ca.js file; I hope to improve the plugin in the future :-)

    I am trying to understand how the parser works by reading your documentation and the code, and I hope to improve the plugin soon. Ciao Sandro


© 2006-2008 mitcho (Michael 芳貴 Erlewine).
The views expressed on these pages are mine alone and do not
reflect those of my employers and clients, past and present.