Recent work in the Ubiquity internationalization realm has focused on the upcoming Ubiquity parser which will bring some great new features to Ubiquity, including support for overlord verbs and semi-automatic localization of commands via semantic roles. It’s possible, though, that these new features will break backwards compatibility of the current command specification and noun types. [[Creative destruction]] for the win.
As we look to incorporate the next-generation parser into Ubiquity proper, it becomes important to look at the current command ecosystem and see how disruptive this move might be. To this end, last night I wrote a quick Perl script to scrape the commands cached on the herd and get some quantitative answers to my questions.
(1577 different verbs were analyzed. None of the computations below is weighted by feed popularity.)
Q: Are there a lot of commands which use more than one argument?
A: The vast majority (>85%) of commands take one or no arguments, requiring no modifiers. Only the remaining commands (fewer than 15%) will require a switch to refer to different arguments by semantic role.
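As a rough illustration of how such a count could be made (this is not the Perl script mentioned above), here is a minimal Python sketch. It assumes each scraped command has already been reduced to a dictionary with optional `takes` and `modifiers` entries; the record layout, the function names, and the sample commands are all hypothetical and are not data from the herd.

```python
def argument_count(command):
    """Count a command's arguments: direct-object (takes) entries plus modifiers."""
    return len(command.get("takes", {})) + len(command.get("modifiers", {}))


def share_with_multiple_arguments(commands):
    """Return the fraction of commands that take more than one argument."""
    if not commands:
        return 0.0
    multi = sum(1 for c in commands if argument_count(c) > 1)
    return multi / len(commands)


# Hypothetical sample records for illustration only (not scraped data).
sample = [
    {"name": "translate",
     "takes": {"text": "noun_arb_text"},
     "modifiers": {"to": "noun_type_language", "from": "noun_type_language"}},
    {"name": "search", "takes": {"query": "noun_arb_text"}},
    {"name": "help"},
    {"name": "send",
     "takes": {"message": "noun_arb_text"},
     "modifiers": {"to": "noun_arb_text"}},
]

multi_share = share_with_multiple_arguments(sample)
```

Under this definition, a command counts as needing semantic roles only when it has more than one argument in total, which is one reasonable reading of the figure above rather than the exact rule the original script applied.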
Q: Do many commands introduce custom noun types?
A: 147 different noun types (lumping anonymous inline objects as one type) were detected. The vast majority of all takes (direct object) arguments were of type noun_arb_text, although many modifiers arguments used custom noun types. The other standard (built-in) noun types are well represented as well, with noun_type_language coming in at second place. Here’s a chart with all the noun types which had more than one use.
Q: Are commands with modifiers using natural-language delimiters?
A: Most of the modifiers detected were English prepositions such as “from”, “to”, “as”, and “with”, but other words were also seen, such as “title”, “type”, “username”, and “message”, and a handful of commands even used symbols such as “@”, “>”, or “#”.
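The three groups described above could be separated with something like the following Python sketch. The preposition list is an assumption (a small hand-picked set, not an exhaustive one), the function name is hypothetical, and the modifier names in the tally are simply the examples quoted above, not the full scraped set.

```python
import re
from collections import Counter

# Assumed, non-exhaustive list of English prepositions used as delimiters.
PREPOSITIONS = {"from", "to", "as", "with", "in", "on", "at", "by", "for"}


def classify_modifier(name):
    """Label a modifier name as a preposition, a symbol, or some other word."""
    key = name.strip().lower()
    if key in PREPOSITIONS:
        return "preposition"
    # Only punctuation-like characters: no letters, digits, underscores, or spaces.
    if re.fullmatch(r"[^\w\s]+", key):
        return "symbol"
    return "other word"


# Modifier names quoted in the answer above, used here purely as examples.
examples = ["from", "to", "as", "with", "title", "type",
            "username", "message", "@", ">", "#"]
tally = Counter(classify_modifier(m) for m in examples)
```

A lookup against a fixed word list keeps the sketch simple; a real analysis would need a fuller preposition list to avoid filing less common prepositions under "other word".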