I have recently begun giving serious thought to what command chaining might look like in Ubiquity and what it would take to make it happen. The “command chaining,” or “piping,” described here always involves (at least) two verbs acting sequentially on a passed target: the first command performs some action or lookup, and the second command acts on the first command’s output.
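As a rough illustration of that idea, here is a minimal sketch in Python (not Ubiquity’s own JavaScript API). The verbs `lookup` and `shout` and the helper `chain` are hypothetical names made up for this example; the only point is that each verb receives the previous verb’s output.

```python
from typing import Callable, List

# A "verb" here is just a function from one string to another.
# Both verbs below are hypothetical stand-ins, not real Ubiquity commands.
Verb = Callable[[str], str]


def lookup(target: str) -> str:
    """First verb: a toy lookup backed by a hard-coded table."""
    table = {"tokyo": "Tokyo, Japan"}
    return table.get(target.lower(), target)


def shout(text: str) -> str:
    """Second verb: acts on whatever the previous verb returned."""
    return text.upper()


def chain(verbs: List[Verb], target: str) -> str:
    """Run the verbs left to right, feeding each output to the next verb."""
    result = target
    for verb in verbs:
        result = verb(result)
    return result


print(chain([lookup, shout], "tokyo"))
```

A real design would also have to decide what happens when the first verb returns several candidate results, which this sketch ignores.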
As Ubiquity 0.5 will be released soon (Thursday morning in Mountain View), I decided it was a good time to put together a screencast in Japanese demoing the use of the new Japanese parser and commands.
This past Monday I presented at Tokyo 2.0, Japan’s largest bilingual web/tech community. I presented as part of a session on The Web and Language, which I also helped organize. Other presenters included Junji Tomita from goo Labs, Shinjyou Sunao of Knowledge Creation, developers of the Voice Delivery System API, and Chris Salzberg of Global Voices Online on community translation.
I just put together a video of my Ubiquity presentation, mixing the audio recorded live at the presentation together with a screencast of my slides for better visibility. The presentation is 10 minutes long and is bilingual, English and Japanese.
Yesterday I was invited to give a lecture for students in the MEXT IT Specialist Program. ITSP is a partnership between Keio, Waseda, and Chuo Universities and NTT, IBM, and Mozilla to bring advanced IT training and opportunities to their Master’s students. It was a longish time slot, so I decided to split it up into two different talks: one on open source and open processes (similar to one of my sessions at the recent BarCamp Tokyo) and one on the future of interfaces, internationalization and globalization, and Ubiquity. Here are the slides for posterity. (Note: the second set of slides is mostly in Japanese.)
Every day on the way to work I walk by a fine establishment known as Yoshinoya (吉野家), Japan’s largest gyudon (牛丼) chain restaurant. For those of you whose lives have yet to be graced by gyudon, it’s a bowl of rice topped with beef and onions stewed in a sweet-savory soy-based sauce. Loving gyudon and being a cheapskate, I naturally noticed the recent 50-yen-off gyudon promotion at Yoshinoya. The photo above shows part of that sign.
Part of this sign, though, made me think about our new Ubiquity parser: in particular, the attachment ambiguity in the end date of the promotion. The text in the photo above literally reads “April 15th (Wed.) 8PM until”. (Note that Japanese is a strongly head-final language, and that the “until” is a postposition.) There are two possible readings for this expression, as illustrated by the two composition trees below.
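To see why there are exactly two readings, here is a small Python sketch that enumerates every binary bracketing of the phrase. Treating the phrase as three units (date, time, “until”) is my own simplifying assumption, and the function names are hypothetical; this is not the Ubiquity parser.

```python
from typing import List, Tuple, Union

# A tree is either a single token or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def bracketings(tokens: List[str]) -> List[Tree]:
    """Return every way to combine adjacent tokens into a binary tree."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees: List[Tree] = []
    for split in range(1, len(tokens)):
        for left in bracketings(tokens[:split]):
            for right in bracketings(tokens[split:]):
                trees.append((left, right))
    return trees


def show(tree: Tree) -> str:
    """Render a tree with square brackets around each combined pair."""
    if isinstance(tree, str):
        return tree
    return "[" + show(tree[0]) + " " + show(tree[1]) + "]"


# Assumption: the date (with its weekday) counts as a single unit.
tokens = ["April 15th (Wed.)", "8PM", "until"]
for tree in bracketings(tokens):
    print(show(tree))
```

In the first bracketing “until” attaches only to “8PM”; in the second it attaches to the whole date-and-time expression.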
桜 (sakura) is Japanese for cherry blossom, an important symbol of springtime in Japan and, with it, a symbol of renewal. The cherry blossom is a beautiful, fluffy, light flower which falls quickly off the tree with wind and rain, making it also an important representation of 物の哀れ (mono no aware).
Last weekend my family (including my aunt Mikako and Bailey) took a short trip to Yugawara (湯河原) at the base of the Izu peninsula. Last weekend was possibly the peak of the cherry blossoms this year, making it a very picturesque trip. It’s quite rare for the four of us to all be in the same place at the same time, so these photos are definite keepers:
One of my personal highlights was going down a slide at Azumayama Park in Ninomiya right through a grove of cherry trees in full bloom—it was so beautiful that I had to go back down it again and take a video! Unfortunately the Flash video encoding (or my camera) doesn’t do it justice, but I hope you can fill in the gaps with your imagination.
Yesterday I presented on Ubiquity internationalization and the new parser design at the Mozilla Extension Development Meeting (Japanese), a community event organized by some extension developers in Japan. There were a couple of other Ubiquity-related “lightning talks” as well, so I’ll summarize some of the interesting ideas from those talks below.
To that end, I’ll highlight some of the changes made to the parser demo codebase in the past week:
The weekly internationalization meeting, like all Ubiquity weekly meetings, is completely open to the public. We’d love to hear new voices contribute to the discussion! Take a look at the schedule of upcoming meetings. ↩
A few weeks ago I made some visual mockups of how Ubiquity could look and act in Japanese. Part of this proposal was what I called “particle identification”: that is, immediate in-line identification of delimiters of arguments, which can be overridden by the user:
The inspiration for this idea came from Aza’s blog post “Solving the ‘it’ problem,” which advocates for this type of quick feedback to the user in cases of ambiguity. Such a method would both help the user better understand what the system is interpreting and offer an opportunity to correct improper parses. I just tried mocking up such an input box using jQuery.
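The core of particle identification can be sketched in a few lines. This is a Python illustration, not the jQuery mockup itself: the particle list, the function name `identify_particles`, and the `overridden` parameter are all assumptions made for the example. Real input would need a fuller particle inventory and a way to handle particles that happen to occur inside words.

```python
import re
from typing import List, Optional, Set, Tuple

# Hypothetical particle inventory, for illustration only.
PARTICLES = ["から", "まで", "を", "に"]
PARTICLE_RE = re.compile("|".join(PARTICLES))


def identify_particles(
    text: str, overridden: Optional[Set[int]] = None
) -> List[Tuple[str, str]]:
    """Split text into (argument, particle) pairs.

    `overridden` holds character offsets where the user has said
    "this is not a delimiter"; matches starting there are skipped.
    """
    overridden = overridden or set()
    pairs: List[Tuple[str, str]] = []
    start = 0
    for match in PARTICLE_RE.finditer(text):
        if match.start() in overridden:
            continue
        pairs.append((text[start:match.start()], match.group()))
        start = match.end()
    if start < len(text):
        pairs.append((text[start:], ""))  # trailing text with no particle
    return pairs


print(identify_particles("東京から京都まで"))
# The user overrides the match at offset 2, so から is no longer a delimiter.
print(identify_particles("東京から京都まで", overridden={2}))
```

The override set stands in for the user clicking on a highlighted particle to say that it is part of the argument instead.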
One of the eventual goals of the Ubiquity project is to bring some of its functionality and ideas to Firefox proper. To this end, Aza has been exploring some possible options for what that would look like (round 1, round 2). All of his mockups, however, use English examples. I’m going to start exploring what Ubiquity in Firefox might look like in different kinds of languages. Let’s kick this off with my mother tongue, Japanese.1
UPDATE: Contribute information on how your language identifies its arguments here.
When we execute a command in Ubiquity, in very simple terms, we’re hoping to do something (a verb) to some arguments (the nouns). Every sentence in every language uses some method to encode which arguments correspond to which roles of the verb. Here are a couple of examples:
(1) He sees Mary.
(2) 彼が Maryを 見る。 (Kare-ga Mary-o miru.)
As speakers of English, you can read sentence (1) above and know exactly who is doing the seeing and who is being seen, and speakers of Japanese can get the same information from (2). How do different languages code for arguments in different roles? There are, broadly speaking, three different ways:
I have recently been playing a fair amount of RISK on the web with some friends.1 RISK, for those who don’t know, is a wonderful world domination strategy board game.
My friends and I use a site called warfish.net which lets you set up games with your friends and play sans Flash. You don’t need to play in real time, either… warfish will email you when it’s your turn, making it a great way to play with friends halfway around the world. The site is invite-only, but you can request an invite from me here.
About a week ago I tried playing from my iPhone while on the train and it worked remarkably well. The addition of a proper <meta name='viewport'> tag so I don’t have to zoom in with every reload would be even better, but I really can’t complain. This weekend I was playing on the way to and during breaks at my spacetime workshop as well.
Here’s a quick video I put together to show how it’s done on the iPhone:
Hope to play with you soon!
A little マイブーム (mai boomu, lit. “my boom”, another wasei-eigo roughly meaning a “personal fad”), you might say. ↩
Bailey just asked me what the difference is between 回収 (kaishū) and 収集 (shūshū), two words that would both map to the English verb “collect.” I intuitively came up with a hypothesis to explain the distinction:
回収 may take things away from others when collecting while 収集 does not have that implication.
Things that you 回収 may have been previously distributed by the actor themself while 収集 does not have that implication.1
Not content with armchair theorizing, however, I decided to take advantage of one of the largest corpora in the world: Google.2 To test my hypothesis, I chose two “objects of collection”, one you can take away (and which is often distributed first) and one you can’t take away: アンケート (ankēto “survey,” from the French enquête) and 意見 (iken “opinion”). I then searched Google for each of the four resulting collocations3 in quotes (“•”) and recorded how many hits there were.
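The two objects and two verbs form a 2×2 design, which a short Python sketch can spell out. The function name `quoted_queries` is hypothetical, and joining object and verb with the object particle を is my assumption about the query form, not something stated above. No hit counts are included.

```python
from itertools import product
from typing import List

objects = ["アンケート", "意見"]  # "survey", "opinion"
verbs = ["回収", "収集"]          # the two "collect" words


def quoted_queries(
    objects: List[str], verbs: List[str], particle: str = "を"
) -> List[str]:
    """Build one exact-phrase (quoted) query per object/verb pair.

    Assumption: the collocation is object + particle + verb.
    """
    return [
        '"' + obj + particle + verb + '"'
        for obj, verb in product(objects, verbs)
    ]


for query in quoted_queries(objects, verbs):
    print(query)
```

Each printed string is one exact-phrase search; recording the hit count for each would fill in the four cells of the comparison.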
This second point could also be hypothesized based on the component meaning of 回, which in the verb 回る (mawa=ru) can mean “circle back.” ↩
Google is of course a huge corpus, but it has very limited search and can easily be misused and misunderstood, making it an unreliable (unprofessional) source of statistical data. One alternative for some different statistics is the n-gram data Google offers for research. ↩
“Collocation” on Wikipedia says: “Within the area of corpus linguistics, collocation is defined as a sequence of words or terms which co-occur more often than would be expected by chance.” ↩
I’ve written before about Mailplane, a high-quality Gmail client with some great Mac-specific features. I’ve been happy to be associated with the project as its Japanese localizer. I recently completed the localization for the upcoming version 2.0. As a result, I’ve received twenty free licenses for Mailplane 2.0 from the developer, Ruben Bakker. Email me if you’re interested in one, and keep your eyes peeled for the 2.0 gold release.