
Posts Tagged ‘code’

Start Panic!

Monday, May 11th, 2009

Just saw this nice demo of the a:visited browser history security issue. Visit startpanic.com and click “start” to see it in action.


Read about how this security hole works here. Hopefully flashy demos like this will bring more attention to this issue.

Adding Your Language to Ubiquity Parser 2

Wednesday, April 29th, 2009

NOTE: This blog post has now been added to the Ubiquity wiki and is updated there. Please disregard this article and instead follow these instructions.

You’ve seen the video. You speak another language. And you’re wondering, “how hard is it to add my language to Ubiquity with Parser 2?” The answer: not that hard. With a little bit of JavaScript and some knowledge of and interest in your own language, you’ll be able to get at least rudimentary Ubiquity functionality in your language. Follow along in this step-by-step guide and please submit your (even incomplete) language files!

As Ubiquity Parser 2 evolves, there is a chance that this specification will change in the future. Keep abreast of such changes on the Ubiquity Planet and/or this blog (RSS).

(more…)

Command Chaining with Oni?

Wednesday, April 29th, 2009

There are two challenges to implementing so-called command chaining, but only one of them is choosing a linguistically appropriate structural standard and parsing it. The other is the underlying difficulty of processing each individual “clause” in sequence, asynchronously. Alex Fritze blogged about how a project like his own Oni could dramatically simplify this underlying process.

Ubiquity, Oni, and Composability:

but I cannot instruct it to give me list of translated google results:
translate (google foo) to German  // doesn't work
Or email me the resulting list:
email(translate (google foo) to German) // doesn't work
…So how does Oni relate to this? Oni is a browser-based “embedded structured concurrency framework”. It allows you to write asynchronous code as if it was synchronous, adding back the kind-of composibility that is lost when juggling concurrent strands of execution (such as e.g. pending XMLHttpRequests) with ‘conventional’ sequential languages.
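To make the composability point concrete, here is a minimal sketch of the email(translate(google foo) to German) chain. It does not use Oni's own syntax, which isn't shown here; it uses JavaScript's async/await, a different mechanism that gives a similar "write it as if it were synchronous" feel. The google, translate, and email functions are hypothetical stand-ins, not Ubiquity commands.

```javascript
// Hypothetical stand-ins for asynchronous commands (not Ubiquity's API).
// Each one resolves a little later, the way a pending XMLHttpRequest would.
const later = (value) =>
  new Promise((resolve) => setTimeout(() => resolve(value), 10));

const google = (query) => later([`${query} result 1`, `${query} result 2`]);
const translate = (items, language) =>
  later(items.map((item) => `[${language}] ${item}`));
const email = (items) => later(`emailed ${items.length} items`);

// email(translate(google foo) to German), written as sequential-looking code.
// Each await pauses this function only, not the rest of the page.
async function chain(query) {
  const results = await google(query);
  const translated = await translate(results, "German");
  return email(translated);
}
```

Calling chain("foo") returns a promise for the final status string. Each step reads like an ordinary function call, which is the kind of composability the quote describes.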

Scoring for Optimization

Friday, April 24th, 2009

Suppose you have a number of competing candidates, each of which can be ranked with a score, but it takes a little time to calculate each candidate’s score. You’re only interested in the top n candidates. You want to come up with a scoring scheme where you can throw the extra candidates out of consideration earlier without sacrificing quality. Such is the problem of scoring and ranking suggestions in Ubiquity. What properties must such a scoring system have?

This blog post includes a lot of complex CSS-formatted graphs which may be best viewed in — what else? — Firefox. You may also want to access this blog post directly rather than through a planet.

[Figure: candidates 8, 2, 9, 3, 10, 5, 1, and 7 listed in ranked order, with the cutoff marked at candidate 10.]

One portion of the problem description above merits clarification: I define “without sacrificing quality” to mean that, if we threw out no candidates early and waited until all the scores were computed fully and accurately, we would still get the same top n winners. This already gives us the key insight towards an appropriate solution: we can only throw out a candidate when we know that it has no further chance of making it into the top n.
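As a rough illustration of that insight (not the scheme the full post goes on to describe), suppose each score is a sum of non-negative components computed stage by stage, and each stage has a known maximum. A candidate's partial score is then a floor on its final score, and its partial score plus the remaining maxima is a ceiling. All names below (topN, component, maxima) are made up for this sketch.

```javascript
// Assumption: a final score is a sum of non-negative components, computed one
// stage at a time, and stage i can contribute at most maxima[i].
let evaluations = 0; // counts how many components were actually computed

// Wrap a fixed value as a "slow" component so the saved work is visible.
const component = (value) => () => {
  evaluations += 1;
  return value;
};

function topN(candidates, maxima, n) {
  let alive = candidates.map((c) => ({ ...c, partial: 0 }));
  for (let stage = 0; stage < maxima.length; stage++) {
    // Compute the next component for the surviving candidates only.
    for (const c of alive) c.partial += c.components[stage]();
    // The most any candidate can still gain from the stages not yet run.
    const remaining = maxima.slice(stage + 1).reduce((a, b) => a + b, 0);
    // The n-th best partial score is a floor that n candidates already meet.
    const partials = alive.map((c) => c.partial).sort((a, b) => b - a);
    const floor = partials.length >= n ? partials[n - 1] : -Infinity;
    // Drop a candidate only when even its best case cannot reach that floor.
    alive = alive.filter((c) => c.partial + remaining >= floor);
  }
  return alive
    .sort((a, b) => b.partial - a.partial)
    .slice(0, n)
    .map((c) => c.name);
}

const candidates = [
  { name: "A", components: [component(3), component(0.5)] },
  { name: "B", components: [component(2.5), component(0.2)] },
  { name: "C", components: [component(1), component(1)] },
  { name: "D", components: [component(2), component(1)] },
];
const winners = topN(candidates, [3, 1], 2); // ["A", "D"]
```

Here candidate C is dropped after the first stage, because even a perfect second stage would leave it below the two leaders, so only 7 of the 8 components are computed. The winners are the same ones a full computation would give. Ties are kept rather than dropped, which is why the filter uses >=.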

(more…)

A Demonstration of Ubiquity Parser 2

Friday, April 24th, 2009

Here’s a quick demonstration of Ubiquity Parser 2, aka “the new parser.” I’ll show you how you can use the parser yourself and point out some highlights of the new functionality.


Ubiquity Parser 2: better noun-first suggestions and command localization from mitcho on Vimeo.

(more…)

Count command for Ubiquity

Monday, April 13th, 2009

(This is primarily a blog post to test out Sandro’s plugin for embedding Ubiquity commands in WordPress. If you don’t see the “subscribe to command” come up, make sure you’re looking at the single page view.)

A while back I created a count command for Ubiquity to count HTML elements on a page, so I’ll share it here. The idea is super simple: select some text on your page and execute count p to get the number of paragraphs, or count a to get the number of links, or count tr to get the number of table rows. This is super useful when reading articles with charts or lists online and you want to know how many there are without doing something like copy-pasting into Excel.

The count command is built using jQuery so it can even understand targets like p.class or a[href=...]. Give it a try! ^^
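The core of the idea fits in a few lines. This is only a sketch, not the command's actual source: it uses the standard querySelectorAll call rather than jQuery, it counts within whatever root you hand it rather than the current selection, and the name countMatches is made up.

```javascript
// Count the elements under `root` that match a CSS selector.
// `root` is anything with querySelectorAll, e.g. `document` or an element.
function countMatches(root, selector) {
  return root.querySelectorAll(selector).length;
}

// In a browser console:
//   countMatches(document, "p")        // paragraphs
//   countMatches(document, "tr")       // table rows
//   countMatches(document, "a[href]")  // links that have an href
```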

Rolling out the Roles

Thursday, April 9th, 2009

Jono and I have recently been working to incorporate the Parser The Next Generation into Ubiquity proper, and this of course involves the process of retooling the standard commands with semantic roles. The first step, however, is to come up with a list of universal semantic roles which the verbs will be rewritten to use and individual languages’ parsers will be built to identify. Today I have just such a proposal.

(more…)

Scoring and Ranking Suggestions

Tuesday, April 7th, 2009

I just spent some time reviewing how Ubiquity currently ranks its suggestions in relation to Parser The Next Generation, so I thought I’d put some of these thoughts down in writing.

The issue of ranking Ubiquity suggestions can be restated as predicting an optimal output given a certain input and various conflicting considerations. Ubiquity (1.8, as of this writing) computes four “scores” for each suggestion:

(more…)

Foxkeh demos Ubiquity Parser: The Next Generation

Wednesday, April 1st, 2009

I just made a screencast with Foxkeh to demo the Ubiquity next generation parser and to show how easy it is to add your own language. Foxkeh wants you to localize the parser into your language. How could you say no? ^^


Foxkeh demos Ubiquity Parser: The Next Generation from mitcho on Vimeo.

There are some details which are not covered in this introductory video, such as how to deal with case marking languages or languages without spaces. Hopefully this’ll inspire some people to play with the demo, though. I’d love to hear your comments! ^^

Ubiquity Commands by The Numbers

Wednesday, April 1st, 2009

Recent work in the Ubiquity internationalization realm has focused on the upcoming Ubiquity parser which will bring some great new features to Ubiquity, including support for overlord verbs and semi-automatic localization of commands via semantic roles. It’s possible, though, that these new features will break backwards compatibility of the current command specification and noun types. Creative destruction for the win.

As we look to move forward with incorporating the next generation parser into Ubiquity proper, it becomes important to take a look at the current command ecosystem to see how disruptive this move might be. To this end, last night I wrote a quick Perl script to scrape the commands cached on the herd and get some quantitative answers to my questions.

(more…)

This week on Ubiquity Parser: The Next Generation

Friday, March 27th, 2009


Last week I released a proof-of-concept demo of the next generation Ubiquity parser design, and it was also the focus of discussion in our weekly internationalization meeting.[1] Christian Sonne even wrote a Danish plugin for it during the meeting, a testament to the pluggability of the new parser design.

In addition, at the Ubiquity weekly meeting, pushing this new parser into Ubiquity proper was identified as a key goal of Ubiquity 0.2, making frequent iteration and debate over this parser essential.

To that end, I’ll highlight some of the changes made to the parser demo codebase in the past week: (more…)


  1. The weekly internationalization meeting, like all Ubiquity weekly meetings, is completely open to the public. We’d love to hear new voices contribute to the discussion! Take a look at the schedule of upcoming meetings.

Automating the Linguist’s Job

Tuesday, March 24th, 2009

At the end of my blog post yesterday I hinted at an exciting possible approach to Ubiquity’s localization:

In the future we ideally could build a web-based system to collect these “utterances.” We could … generate parser parameters based on those sentences. That would essentially reduce the parser-construction process to a more run-of-the-mill string translation process.

If we build this type of “command-bank” of common Ubiquity input translated into various languages, we could build a tool to learn various features of each language and generate each parser, essentially learning the language based on data. Today I’ll elaborate on how I believe this could be possible, by analogy to another language learning device: the human.

(more…)

Ubiquity Parser: The Next Generation Demo

Wednesday, March 18th, 2009


A week or two ago while visiting California, Jono and I had a productive charrette, resulting in a new architecture proposal for the Ubiquity parser, as laid out in Ubiquity Parser: The Next Generation. The new architecture is designed to support (1) the use of overlord verbs, (2) writing verbs by semantic roles, and (3) better suggestions for verb-final languages and other argument-first contexts. I’m happy to say that I’ve spent some time putting a proof-of-concept together.

I’ve implemented the basic algorithm of this parser for left-branching languages (like English) and also implemented some fake English verbs, noun types, and semantic roles. This demo should give you a basic sense of how this parser will attempt to identify different types of arguments and check their noun types even without clearly knowing the verb. This should make the suggestion ranking much smarter, particularly for verb-final contexts. (For a good example, try from Tokyo to San Francisco.)
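To give a feel for what identifying arguments without knowing the verb can mean, here is a deliberately tiny sketch. It is not the demo's code: the real parser also checks noun types and ranks competing parses, while this only splits the input at known delimiters. The delimiter table and the role names (source, goal, object) are assumptions made for the example.

```javascript
// Hypothetical delimiter table: English prepositions mapped to role names.
const ROLE_DELIMITERS = new Map([
  ["from", "source"],
  ["to", "goal"],
]);

// Split the input into arguments at each known delimiter. No verb is needed.
function identifyArguments(input) {
  const args = [];
  let current = null;
  for (const word of input.trim().split(/\s+/).filter(Boolean)) {
    const role = ROLE_DELIMITERS.get(word.toLowerCase());
    if (role) {
      // A delimiter starts a new argument with that role.
      current = { role, words: [] };
      args.push(current);
    } else if (current) {
      current.words.push(word);
    } else {
      // Words before any delimiter are treated as an unmarked argument.
      current = { role: "object", words: [word] };
      args.push(current);
    }
  }
  return args.map((a) => ({ role: a.role, text: a.words.join(" ") }));
}

const parsed = identifyArguments("from Tokyo to San Francisco");
// [{ role: "source", text: "Tokyo" }, { role: "goal", text: "San Francisco" }]
```

With the arguments labelled this way, a suggestion engine could look for verbs that accept both a source and a goal even though no verb was typed.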

➔ Check out the Ubiquity next-gen parser demo

(more…)

User-Aided Disambiguation: a demo

Saturday, March 14th, 2009

A few weeks ago I made some visual mockups of how Ubiquity could look and act in Japanese. Part of this proposal was what I called “particle identification”: that is, immediate in-line identification of delimiters of arguments, which can be overridden by the user.

The inspiration for this idea came from Aza’s blog post “Solving the ‘it’ problem” which advocates for this type of quick feedback to the user in cases of ambiguity. Such a method would both help the user better understand what is being interpreted by the system and offer an opportunity for the user to correct improper parses. I just tried mocking up such an input box using jQuery.

Try the User-Aided Disambiguation Demo

If you have any bugfixes to submit or want to play around with your own copy, the demo code is up on BitBucket. ^^ Let me know what you think!

Writing commands with semantic roles

Tuesday, February 24th, 2009

Thank you to everyone who contributed data to how your language identifies its arguments! The data collection is ongoing so please contribute data points for languages you know!

How Ubiquity identifies its arguments

Currently when writing a command in Ubiquity you must specify two properties for each argument: a modifier (the appropriate adposition—the direct object excluded) and the noun type. Here are some quick examples from the standard commands:

email:

  • direct object (noun_arb_text)
  • to (noun_type_contact)

translate:

  • direct object (noun_arb_text)
  • to (noun_type_language)
  • from (noun_type_language)

This way of specifying arguments has a few shortcomings. First of all, it requires you to identify each type of argument by a unique adposition, which supports neither languages with case marking nor languages with sets of synonymous adpositions (e.g. French {à la, au, aux}). Second, as we saw in how your language identifies its arguments, some languages don’t mark semantic roles on the arguments at all, and the current system of specifying arguments is completely incompatible with these languages. Third, the current specification requires command authors to make localized versions of their commands, specifying the language-appropriate modifiers.
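The first shortcoming is easy to see if the email and translate examples above are written down as plain data. The layout below is a hypothetical one made up for illustration, not Ubiquity's actual command format; only the modifier names and noun type names come from the examples.

```javascript
// Hypothetical data layout mirroring the email and translate examples
// (not Ubiquity's actual command API).
const commands = {
  email: {
    object: "noun_arb_text",
    modifiers: { to: "noun_type_contact" },
  },
  translate: {
    object: "noun_arb_text",
    modifiers: { to: "noun_type_language", from: "noun_type_language" },
  },
};

const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Look up the noun type expected after a given adposition, or null if the
// verb does not list that exact word as a modifier.
function nounTypeFor(verb, adposition) {
  if (!has(commands, verb)) return null;
  const { modifiers } = commands[verb];
  return has(modifiers, adposition) ? modifiers[adposition] : null;
}

const english = nounTypeFor("translate", "to"); // "noun_type_language"
const french = nounTypeFor("email", "au");      // null
```

Because the lookup is keyed on one exact word, a French version would need a separate entry for each of à la, au, and aux, and a case-marking language has no word to use as a key at all.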

(more…)


© 2006–2011 mitcho (Michael 芳貴 Erlewine).
Proudly powered by WordPress on Media Temple.
The views expressed on these pages are mine alone and do not
reflect those of my employers and clients, past and present.