Ubiquity in Italian!

Thanks to the great work of Sandro Della Giustina, we now have a preliminary Italian parser for use with Ubiquity Parser 2. Sandro brought up a good point, however, about Italian prepositions which contract with the article and the head noun. For example,

traduci   dall'inglese     al     cinese
translate from=the=English to=the Chinese
One current solution is to add zero-width spaces after these contracted articles, all’ and dall’ (see note 1). The appropriate way to add this to the parser is by defining a custom wordBreaker() method.
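// Match all' or dall' at the start of the input or right after a space.
it._patternCache.contractionMatcher = new RegExp('(^| )(all\'|dall\')','g');
// Insert a zero-width space (U+200B) after each contraction so it can be split from the noun.
it.wordBreaker = function(input) {
  return input.replace(this._patternCache.contractionMatcher,'$1$2\u200b');
};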
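
To see what the word breaker does to the example sentence, here is a small standalone sketch. The `itParser` object is only a hypothetical stand-in for the real Italian parser object (`it` above), and the final split on whitespace and zero-width spaces is my assumption about how word boundaries would then be found, not necessarily what Parser 2 does internally.

```javascript
// Hypothetical stand-in for the Italian parser object (called `it` in the post).
const itParser = { _patternCache: {} };

// Same pattern as above: all' or dall' at the start of the input or after a space.
itParser._patternCache.contractionMatcher = new RegExp("(^| )(all'|dall')", 'g');

// Same word breaker as above: append U+200B after each contracted form.
itParser.wordBreaker = function (input) {
  return input.replace(this._patternCache.contractionMatcher, '$1$2\u200b');
};

const broken = itParser.wordBreaker("traduci dall'inglese al cinese");

// Make the invisible zero-width space visible for inspection.
const visible = broken.replace(/\u200b/g, '<ZWSP>');
console.log(visible); // traduci dall'<ZWSP>inglese al cinese

// Assumption: words are then separated on whitespace or zero-width spaces.
const words = broken.split(/[\s\u200b]+/);
console.log(words); // [ 'traduci', "dall'", 'inglese', 'al', 'cinese' ]
```

Note that the pattern only covers the straight apostrophe (') and the two forms all' and dall', so input typed with a curly apostrophe (’) or with other contracted prepositions would pass through unchanged.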

Grazie Sandro!

  1. As John Daggett pointed out to me, in the future we may have to add an intermediate shallow parse instead of adding characters (in this case, the zero-width space) to the modified input.