mitcho Michael 芳貴 Erlewine

Postdoctoral fellow, McGill Linguistics.


Posts Tagged ‘Japanese language’

Exploring Command Chaining in Ubiquity: Part 2

Sunday, August 23rd, 2009


I recently have begun giving serious thought to what command chaining might look like in Ubiquity and the various considerations which must be made to make it happen. The “command chaining,” or “piping,” described here always involves (at least) two verbs acting sequentially on a passed target—that is, the first command performs some action or lookup and the second command acts on the first command’s output.

A few days ago I penned some initial technical considerations regarding command chaining. In this post I’ll point out some linguistic considerations involved in supporting a natural syntax for chaining.
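As a rough illustration of the chaining idea described above, here is a minimal Python sketch. It is not Ubiquity code: `chain`, `lookup`, and `shout` are hypothetical names invented for this example, and real Ubiquity commands are written in JavaScript. It only shows the data flow, where the second verb acts on the first verb's output.

```python
from functools import reduce

def chain(*commands):
    """Compose commands so each one acts on the previous one's output."""
    def run(target):
        return reduce(lambda value, command: command(value), commands, target)
    return run

# Two hypothetical "verbs" (illustrative only, not real Ubiquity commands).
def lookup(term):
    """Pretend lookup: map a short name to a fuller description."""
    places = {"tokyo": "Tokyo, Japan"}
    return places.get(term.lower(), term)

def shout(text):
    """Act on whatever text the previous command produced."""
    return text.upper()

lookup_then_shout = chain(lookup, shout)
result = lookup_then_shout("Tokyo")  # "TOKYO, JAPAN"
```

The sketch says nothing about syntax, which is the hard part: how a user would naturally express the two verbs in one input.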


Ubiquity 0.5 Released, Including Japanese Support

Friday, July 10th, 2009

Here’s a cross-post of a Ubiquity 0.5 announcement (in particular regarding the new Japanese support) that I wrote for the Mozilla Japan blog.

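Yesterday we released version 0.5, the latest version of Ubiquity, one of Mozilla Labs’ experimental projects. (Official Mozilla Labs announcement [in English])

Ubiquity provides an interface for controlling Firefox with natural language, in order to make the web more useful and easier to use. As the number of open APIs and features on the web keeps growing, what kind of interface do we need? Pursuing an answer to that question led to an interface that combines the precision and speed of text input with the comfort of natural language. For example, you can control the browser by typing, in your own words, things like 「麹町を地図で表示」 (“show Kojimachi on a map”) or 「これを (誰々) へメール」 (“email this to (someone)”). New commands (verbs) can also be written easily in JavaScript, which makes it a highly extensible platform.

With the goal of a “natural syntax” for the user ([in English]), and after several months of research, Ubiquity 0.5 implements a parser that can handle the differing syntax of multiple languages. Ubiquity’s built-in commands can now be localized as well, and 0.5 ships with Japanese, Danish, and Portuguese versions of all the built-in commands.

Just before the release I made an introductory video for Ubiquity in Japanese, so please take a look. It also explains how to use Ubiquity in Japanese mode.

Ubiquity 0.5 日本語紹介ビデオ (Ubiquity 0.5 Japanese introduction video) from mitcho on Vimeo.

Please do try Ubiquity 0.5, now with Japanese support. We intend to keep developing it so that more and more users can use this interface “naturally.”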

Ubiquity 0.5 Japanese Introduction Video

Thursday, July 2nd, 2009

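In preparation for Ubiquity 0.5, the latest version, which is being released tonight, I made a Ubiquity screencast in Japanese. Ubiquity 0.5 is a release that puts particular emphasis on multilingual support, and Ubiquity’s built-in commands can now be used in Japanese and Danish. Please try installing it!

P.S.: As of July 3rd, the release of Ubiquity 0.5 is being pushed back, so unfortunately it will not be released today. Please install it once it is released.

Ubiquity 0.5 日本語紹介ビデオ (Ubiquity 0.5 Japanese introduction video) from mitcho on Vimeo.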

As Ubiquity 0.5 will be released soon (Thursday morning in Mountain View), I decided it was a good time to put together a screencast in Japanese demoing the use of the new Japanese parser and commands.

Ubiquity presentation at Tokyo 2.0

Wednesday, June 10th, 2009


This past Monday I presented at Tokyo 2.0, Japan’s largest bilingual web/tech community. I presented as part of a session on The Web and Language, which I also helped organize. Other presenters included Junji Tomita from goo Labs, Shinjyou Sunao of Knowledge Creation, developers of the Voice Delivery System API, and Chris Salzberg of Global Voices Online on community translation.

I just put together a video of my Ubiquity presentation, mixing the audio recorded live at the presentation together with a screencast of my slides for better visibility. The presentation is 10 minutes long and is bilingual, English and Japanese.

Ubiquity: Command the Web with Language 言葉で操作する Web from mitcho on Vimeo.


Lecture at ITSP - 先端ITスペシャリスト育成プログラムにて講義

Thursday, June 4th, 2009

Yesterday I was invited to give a lecture for students in the MEXT IT Specialist Program. ITSP is a partnership between Keio, Waseda, and Chuo Universities and NTT, IBM, and Mozilla to bring advanced IT training and opportunities to their Master’s students. It was a longish time slot, so I decided to split it up into two different talks: one on open source and open processes (similar to one of my sessions at the recent BarCamp Tokyo) and one on the future of interfaces, internationalization and globalization, and Ubiquity. Here are the slides for posterity. (Note: the second set of slides is mostly in Japanese.)


Design processes in the open-source era オープンソース時代のデザインプロセス

Ubiquity: Interfaces and Internationalization インターフェースと国際化

Attachment Ambiguity—or—when is the gyudon cheap?

Wednesday, April 15th, 2009


Every day on the way to work I walk by a fine establishment known as Yoshinoya (吉野家), Japan’s largest gyudon (牛丼) chain restaurant. For those of you whose lives have yet to be graced by gyudon, it’s a bowl of rice topped with beef and onions stewed in a sweet-savory soy-based sauce. Loving gyudon and being a cheapskate, I naturally noticed the recent 50 yen off gyudon promotion at Yoshinoya. The photo above shows part of that sign.

Part of this sign, though, made me think about our new Ubiquity parser. In particular, it was the attachment ambiguity in the end date of the promotion. The text in the photo above literally reads “April 15th (Wed.) 8PM until”. (Note that Japanese is a strongly head-final language, and that the “until” is a postposition.) There are two possible readings for this expression, as illustrated by the two composition trees below.
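To make the ambiguity concrete, here is a small Python sketch. It treats the phrase as three units (an assumption made for illustration, not the post's own analysis) and enumerates every way of combining adjacent units pairwise. Three units yield exactly two binary trees: either “until” combines with “8PM” first, or it combines with the whole date-and-time expression.

```python
def bracketings(units):
    """Return every binary tree (as nested tuples) over a sequence of units."""
    if len(units) == 1:
        return [units[0]]
    trees = []
    # Try every split point; combine each left subtree with each right subtree.
    for split in range(1, len(units)):
        for left in bracketings(units[:split]):
            for right in bracketings(units[split:]):
                trees.append((left, right))
    return trees

# Assumed segmentation of the sign's text, in its original (head-final) order.
units = ("April 15th (Wed.)", "8PM", "until")
trees = bracketings(units)
# trees[0] groups "8PM" with "until" first;
# trees[1] groups the date and time first, then attaches "until".
```

A parser sees both trees as equally well formed; deciding between them needs information beyond the word order.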


Friday, April 10th, 2009

桜 (sakura) is Japanese for cherry blossom, an important symbol of springtime in Japan and, with it, a symbol of renewal. The cherry blossom is a beautiful, fluffy, light flower which falls quickly off the tree with wind and rain, making it also an important representation of 物の哀れ (mono no aware).

Last weekend my family (including my aunt Mikako and Bailey) took a short trip to Yugawara (湯河原) at the base of the Izu Peninsula. Last weekend was possibly the peak of the cherry blossoms this year, making it a very picturesque trip. It’s quite rare for the four of us to all be in the same place at the same time, so these photos are definite keepers:

One of my personal highlights was going down a slide at Azumayama Park in Ninomiya, right through a grove of cherry trees in full bloom. It was so beautiful that I had to go back down it again and take a video! Unfortunately the Flash video encoding (or my camera) doesn’t do it justice, but I hope you can fill in the gaps with your imagination.

Cherry blossom slide - 桜のすべりだい(二宮吾妻山公園) from mitcho on Vimeo.

Talking Ubiquity in Japan: 拡張機能勉強会にて発表

Monday, March 30th, 2009

Yesterday I presented on Ubiquity internationalization and the new parser design at the Mozilla Extension Development Meeting (Japanese), a community event organized by some extension developers in Japan. There were a couple of other Ubiquity-related “lightning talks” as well, so I’ll summarize some of the interesting ideas from those talks below.

Yesterday I presented on Ubiquity internationalization and the next-generation parser at the 11th Mozilla Extension Development Meeting. I received many sharp comments and learned a lot myself. ^^ I’ve put the slides up on SlideShare, so please have a look for reference. The lightning talks also got lively with Ubiquity discussion, so below I’ll briefly summarize, in English, the parts of those talks that I found especially interesting.


This week on Ubiquity Parser: The Next Generation

Friday, March 27th, 2009


Last week I released a proof-of-concept demo of the next generation Ubiquity parser design, and it was also the focus of discussion in our weekly internationalization meeting.1 Christian Sonne even wrote a Danish plugin for it during the meeting, a testament to the pluggability of the new parser design.

In addition, at the Ubiquity weekly meeting, pushing this new parser into Ubiquity proper was identified as a key goal of Ubiquity 0.2, making frequent iteration and debate over this parser essential.

To that end, I’ll highlight some of the changes made to the parser demo codebase in the past week:

  1. The weekly internationalization meeting, like all Ubiquity weekly meetings, is completely open to the public. We’d love to hear new voices contribute to the discussion! Take a look at the schedule of upcoming meetings.

User-Aided Disambiguation: a demo

Saturday, March 14th, 2009

A few weeks ago I made some visual mockups of how Ubiquity could look and act in Japanese. Part of this proposal was what I called “particle identification”: that is, immediate in-line identification of delimiters of arguments, which can be overridden by the user:

The inspiration for this idea came from Aza’s blog post “Solving the ‘it’ problem,” which advocates for this type of quick feedback to the user in cases of ambiguity. Such a method would both help the user better understand what the system is interpreting and offer an opportunity for the user to correct improper parses. I just tried mocking up such an input box using jQuery.
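The core of the idea can be sketched outside the browser. The Python below is not the demo's jQuery code: the particle list is a tiny assumed sample, and `identify_particles` and its `overrides` parameter are hypothetical names. It flags every position where a known particle appears, and lets the user veto individual positions where the match is really part of a word.

```python
# A small, assumed sample of Japanese particles; a real list would be longer.
PARTICLES = ("から", "まで", "を", "に", "へ", "で")

def identify_particles(text, overrides=frozenset()):
    """Return (start, end, particle) for each candidate argument delimiter.

    `overrides` holds start positions the user has marked as "not a particle".
    """
    # Try longer particles first so "まで" is not read as a bare "で".
    candidates = sorted(PARTICLES, key=len, reverse=True)
    found = []
    i = 0
    while i < len(text):
        match = next((p for p in candidates if text.startswith(p, i)), None)
        # A delimiter needs something before it, so position 0 never counts.
        if match and i > 0 and i not in overrides:
            found.append((i, i + len(match), match))
            i += len(match)
        else:
            i += 1
    return found

guess = identify_particles("おにぎりをたべる")
# The に inside おにぎり ("rice ball") is wrongly flagged at position 1,
# so the user overrides that position:
corrected = identify_particles("おにぎりをたべる", overrides={1})
```

Plain substring matching over-identifies by design here; the point of the interface is that the user sees each guess immediately and can correct it.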

Try the User-Aided Disambiguation Demo

If you have any bugfixes to submit or want to play around with your own copy, the demo code is up on BitBucket. ^^ Let me know what you think!

Ubiquity in Firefox: Focus on Japanese

Friday, February 20th, 2009

One of the eventual goals of the Ubiquity project is to bring some of its functionality and ideas to Firefox proper. To this end, Aza has been exploring some possible options for what that would look like (round 1, round 2). All of his mockups, however, use English examples. I’m going to start exploring what Ubiquity in Firefox might look like in different kinds of languages. Let’s kick this off with my mother tongue, Japanese.1

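Going forward I’ll be examining what Ubiquity in Firefox could look like for a variety of languages, and today I’m taking up Japanese in particular. I plan to post the same content in Japanese at a later date. ^^ Comments in Japanese are very welcome too!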


Three ways to argue over arguments

Wednesday, February 18th, 2009

UPDATE: Contribute information on how your language identifies its arguments here.

When we execute a command in Ubiquity, in very simple terms, we’re hoping to do something (a verb) to some arguments (the nouns). Every sentence in every language uses some method to encode which arguments correspond to which roles of the verb. Here are a couple examples:

  1. He sees Mary.
  2. 彼が Maryを 見る。 (Kare-ga Mary-o miru.)

As speakers of English, you can read sentence (1) above and know exactly who is doing the seeing and who is being seen, and speakers of Japanese can get the same information from (2). How do different languages code for arguments in different roles? There are, broadly speaking, three different ways:

[Figure: three ways to code for arguments in different roles]

We’ll take a brief look today at these three different strategies, all of which a localizable natural language interface will surely encounter.
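As a rough sketch of what this means for a parser, the Python below reads roles off the two example sentences using the cue each one visibly relies on: position in (1) and attached particles in (2). This is a deliberately naive illustration, not Ubiquity's parser; the function names and the two-entry particle table are assumptions, and the input is assumed to be pre-tokenized.

```python
def roles_by_word_order(tokens):
    """English-style: roles follow from position (subject, verb, object)."""
    subject, verb, obj = tokens
    return {"subject": subject, "verb": verb, "object": obj}

# Assumed minimal table: particle suffix -> role it marks.
PARTICLE_ROLES = {"が": "subject", "を": "object"}

def roles_by_marking(tokens):
    """Japanese-style: roles follow from the particle attached to each noun."""
    roles = {}
    for token in tokens:
        role = PARTICLE_ROLES.get(token[-1])
        if role:
            roles[role] = token[:-1]  # strip the particle
        else:
            roles["verb"] = token     # unmarked token is taken as the verb
    return roles

english = roles_by_word_order(["He", "sees", "Mary"])
japanese = roles_by_marking(["彼が", "Maryを", "見る"])
# Because the marking travels with the noun, reordering the nouns
# does not change the result:
scrambled = roles_by_marking(["Maryを", "彼が", "見る"])
```

Swapping the nouns in the English token list, by contrast, would swap who sees whom, which is why a single word-order-based parser cannot simply be reused across languages.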


RISK on the iPhone

Monday, November 10th, 2008

I recently have been playing a good deal of RISK on the web with some friends.1 RISK, for those who don’t know, is a wonderful world domination strategy board game.

RIsky business
Creative Commons License photo credit: kazamatsuri

My friends and I use a site called Warfish, which lets you set up games with your friends and play sans Flash. You don’t need to play in real time, either… Warfish will email you when it’s your turn, making it a great way to play with friends halfway around the world. The site is invite-only, but you can request an invite from me here.

About a week ago I tried playing from my iPhone while on the train and it worked remarkably well. The addition of a proper <meta name='viewport'> tag so I don’t have to zoom in with every reload would be even better, but I really can’t complain. This weekend I was playing on the way to and during breaks at my spacetime workshop as well.

Here’s a quick video I put together to show how it’s done on the iPhone:

Hope to play with you soon!

  1. A little マイブーム (mai boomu, lit. “my boom”, another wasei-eigo roughly meaning a “personal fad”), you might say.

回収 vs. 収集 and Better Word Meanings Through Usage

Thursday, September 18th, 2008

Bailey just asked me what the difference is between 回収 (kaishū) and 収集 (shūshū), two words that would both map to the English verb “collect.” I intuitively came up with a hypothesis to explain the distinction:

  • 回収 may take things away from others when collecting while 収集 does not have that implication.
  • Things that you 回収 may have been previously distributed by the actor themself while 収集 does not have that implication.1

Not content with armchair theorizing, however, I decided to take advantage of one of the largest corpora in the world: Google.2 To test my hypothesis, I chose two “objects of collection”, one you can take away (and which is often distributed first) and one you can’t take away: アンケート (ankēto “survey,” from the French enquête) and 意見 (iken “opinion”). I then searched Google for each of the four resulting collocations3 in quotes (“•”) and recorded how many hits there were.
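The design is a simple two-by-two grid: two objects crossed with two verbs. As a sketch, the four quoted queries can be generated like this in Python. The exact search strings are not given above, so joining object and verb with the object particle を is an assumption, as is the function name `quoted_queries`.

```python
from itertools import product

OBJECTS = ("アンケート", "意見")  # survey, opinion
VERBS = ("回収", "収集")          # the two "collect" words being compared

def quoted_queries(objects=OBJECTS, verbs=VERBS, particle="を"):
    """Build one exact-phrase (quoted) query per object/verb pair.

    Joining with the particle を is an assumed query form.
    """
    return [f'"{obj}{particle}{verb}"' for obj, verb in product(objects, verbs)]

queries = quoted_queries()
# One query per cell of the 2x2 grid, in object-major order.
```

The hypothesis predicts an asymmetry across the grid: the object that cannot be taken away should pair comparatively poorly with 回収.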


  1. This second point could also be hypothesized based on the component meaning of 回, which in the verb 回る (mawa=ru) can mean “circle back.” 

  2. Google is of course a huge corpus, but it has very limited search and can easily be misused and misunderstood, thus making Google an unreliable (unprofessional) source for statistical data. One Google alternative for some different statistics is the n-gram data they offer for research.

  3. “Collocation” on Wikipedia says: “Within the area of corpus linguistics, collocation is defined as a sequence of words or terms which co-occur more often than would be expected by chance.”

Free licenses for Mailplane 2.0—Mailplane 2.0 の無料ライセンズ

Tuesday, August 19th, 2008

I’ve written before about Mailplane, a high-quality Gmail client with some great Mac-specific features. I’ve been happy to be associated with the project as its Japanese localizer. I recently completed the localization for the upcoming version 2.0. As a result, I’ve received twenty free licenses for Mailplane 2.0 from the developer, Ruben Bakker. Email me if you’re interested in one, and keep your eyes peeled for the 2.0 gold release.

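I’ve brought it up here before, but today let me announce the new version of Mailplane. Mailplane is a Gmail client packed with Mac-like features, and its latest version (2.0), which supports Gmail 2.0, will be released soon. Since I’m in charge of Mailplane’s Japanese version, I received twenty free licenses for Mailplane 2.0 from the developer, Ruben Bakker. If you’d like one, please email me here.

Also, I hear that Mailplane will soon be featured in MacFan in Japan. I’m looking forward to it! Finally, if you think there’s a problem with the Japanese version, please tell me directly before writing it up on your own. ^^ Thank you. m(__)m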