Scoring and Ranking Suggestions
I just spent some time reviewing how Ubiquity currently ranks its suggestions in relation to Parser The Next Generation, so I thought I’d put some of these thoughts down in writing.
The issue of ranking Ubiquity suggestions can be restated as predicting an optimal output given a certain input and various conflicting considerations. Ubiquity (1.8, as of this writing) computes four “scores” for each suggestion:
`duplicateDefaultMatchScore`
: 100 by default; lowered if an unused argument gets multiple suggestions (in the words of the code: “reduce the match score so that multiple entries with the same verb are only shown if there are no other verbs.”)

`frequencyMatchScore`
: a score from the suggestion memory of the frequency of the suggestion’s verb, given the input verb (currently the first word) or, in the case of noun-first suggestions, nothing

`verbMatchScore`
: a float in [0,1], as described here (a sketch of this computation follows the list):

- 0.75 if it is a noun-first suggestion (by virtue of the fact that `String.indexOf('') == 0`)
- 1 if the verb name is identical across input and output
- in [0.75,1) if the input is a prefix of the suggestion’s verb name
- in [0.5,0.75) if the input is a non-prefix substring of the suggestion’s verb name
- in [0.25,0.5] if the input is a prefix of one of the synonyms
- in [0,0.25) if the input is a non-prefix substring of one of the synonyms

`argMatchScore`
: the number of arguments with matching “specific” nountypes, where “specific” is designated by the nountype having the property `rankLast=false`.
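To make the banded `verbMatchScore` concrete, here is a minimal sketch of how such a computation could look. This is an illustration of the bands above, not Ubiquity’s actual implementation; the `verb.name` and `verb.synonyms` shapes are assumptions.

```javascript
// A sketch of a banded verb match score; not Ubiquity's real code.
function verbMatchScore(input, verb) {
  if (input === '')            // noun-first suggestion: empty input
    return 0.75;               // (''.indexOf-style logic matches at 0)
  if (input === verb.name)     // exact match across input and output
    return 1;
  var index = verb.name.indexOf(input);
  var frac = input.length / verb.name.length;
  if (index === 0)             // prefix of the verb name: [0.75, 1)
    return 0.75 + 0.25 * frac;
  if (index > 0)               // non-prefix substring: [0.5, 0.75)
    return 0.5 + 0.25 * frac;
  var best = 0;                // otherwise, check the synonyms
  verb.synonyms.forEach(function (syn) {
    var idx = syn.indexOf(input);
    var synFrac = input.length / syn.length;
    if (idx === 0)             // prefix of a synonym: [0.25, 0.5]
      best = Math.max(best, 0.25 + 0.25 * synFrac);
    else if (idx > 0)          // non-prefix substring of a synonym: [0, 0.25)
      best = Math.max(best, 0.25 * synFrac);
  });
  return best;
}
```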
With the numeric scores for each of these criteria, a partial order of suggestions is constructed using a [[lexicographic order]]: that is, compare candidates first using `duplicateDefaultMatchScore`, break ties using `frequencyMatchScore`, if still tied break using `verbMatchScore`, and if still tied break using `argMatchScore`. This paradigm of constraints is called “strictly ranked”: lower constraints, no matter how well you score on them, can never overcome a loss at a higher constraint. A crucial corollary of this system is that a lower constraint’s score need not be computed at all once a higher constraint has already doomed the suggestion to a lower position.¹
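To illustrate, a strictly ranked comparison can be expressed as a simple comparator over the four scores. This is a sketch assuming plain suggestion objects carrying those score properties, not Ubiquity’s actual sort code:

```javascript
// Strictly ranked (lexicographic) comparison: higher constraints first.
var CRITERIA = ['duplicateDefaultMatchScore', 'frequencyMatchScore',
                'verbMatchScore', 'argMatchScore'];

function compareSuggestions(a, b) {
  for (var i = 0; i < CRITERIA.length; i++) {
    var diff = b[CRITERIA[i]] - a[CRITERIA[i]];
    if (diff !== 0)    // a loss here can never be recovered below
      return diff;
  }
  return 0;            // tied on every constraint
}

// usage, given an array of scored suggestion objects:
// suggestions.sort(compareSuggestions);   // best suggestion first
```

In a lazier variant, each score would be computed on demand inside the loop, realizing the corollary that lower constraints need never be computed once a higher one is decisive.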
Ranking in The Next Generation
One of the goals of Parser The Next Generation is to make noun/argument-first input a first-class citizen of Ubiquity, improving its suggestions in particular to the benefit of verb-final languages. Arguments will be split up and tested against different noun types before a verb is even entered into the input, in which case target verbs can be ranked according to the appropriateness of the input’s arguments. As such, I believe the `argMatchScore` criterion above should either be ranked higher in a strictly ranked model or be allowed to compensate for lower scores on the higher constraints in a non-strictly ranked model.
The Parser The Next Generation proposal and demo currently orders suggestions using a product of various criteria’s scores, rather than a lexicographic order of strictly ranked constraints. The component factors are:

- `0.5` for parses where the verb was suggested
- `0.5` for each extra (>1) `object` argument (essentially “unused words” in the previous parser)
- the score of each argument against that semantic role’s target noun type
- `0.8` for each unset argument of that verb
Each component factor is a value in [0,1], so the score is non-increasing across the derivation: multiplying in another factor can only lower it or leave it unchanged. This offers a natural way to optimize candidate set creation: if a possible parse ever gets a score below a magic “threshold” value, it is immediately thrown away, since no later factor could raise it back up. A sketch of this pruning follows.
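For concreteness, here is a minimal sketch of that product-with-threshold scheme under the four factors listed above. The parse structure and the `THRESHOLD` value are assumptions for illustration, not Parser TNG’s actual code:

```javascript
// Multiplicative scoring with early pruning; hypothetical parse shape.
var THRESHOLD = 0.2;

function scoreParse(parse) {
  var score = 1;
  function apply(factor) {
    score *= factor;
    // every factor is in [0,1], so the score can only fall:
    // once below THRESHOLD, the parse can be discarded for good.
    return score >= THRESHOLD;
  }
  if (parse.verbWasSuggested && !apply(0.5)) return null;
  for (var i = 1; i < parse.objectArgs.length; i++)   // extra objects past the first
    if (!apply(0.5)) return null;
  for (var j = 0; j < parse.args.length; j++)         // per-argument noun type scores
    if (!apply(parse.args[j].nounTypeScore)) return null;
  for (var k = 0; k < parse.unsetArgCount; k++)       // unset arguments of the verb
    if (!apply(0.8)) return null;
  return score;
}
```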
A possible problem with the current Parser TNG scoring model is that it implicitly hinders verbs and parses with more arguments, since they can contribute more sub-1 noun type score factors. This consideration may be great enough that a weighted additive model should be considered over a multiplicative one; one sketch of such an alternative follows.
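As one illustration of that alternative, a weighted additive model might look like the following. The weights and field names are hypothetical, invented purely for illustration:

```javascript
// A hypothetical weighted additive score; weights are invented,
// not part of the Parser TNG proposal.
var WEIGHTS = { verbSuggested: 0.15, extraObjects: 0.15,
                argMatch: 0.5, unsetArgs: 0.2 };

function additiveScore(parse) {
  var argScores = parse.args.map(function (a) { return a.nounTypeScore; });
  var meanArgScore = argScores.length ?
      argScores.reduce(function (x, y) { return x + y; }, 0) / argScores.length
      : 1;
  return WEIGHTS.verbSuggested * (parse.verbWasSuggested ? 0.5 : 1) +
         WEIGHTS.extraObjects * Math.pow(0.5, Math.max(0, parse.objectArgs.length - 1)) +
         // averaging per-argument scores avoids penalizing higher-arity verbs
         WEIGHTS.argMatch * meanArgScore +
         WEIGHTS.unsetArgs * Math.pow(0.8, parse.unsetArgCount);
}
```

Averaging the per-argument scores rather than multiplying them is what removes the implicit penalty on higher-arity verbs; the trade-off is that the early-pruning optimization above no longer applies, since an additive score can recover from one bad term.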
How do you think we can make Ubiquity’s suggestion ranking smarter? What other factors should be considered, and what factors could be left out?
1. For all the linguists in the audience, if this sounds like [[Optimality Theory]], you would be right: there’s a little bit of Prince and Smolensky (1993) hanging out in your browser. ↩