Ubiquity in Firefox: Focus on Japanese

One of the eventual goals of the Ubiquity project is to bring some of its functionality and ideas to Firefox proper. To this end, Aza has been exploring some possible options for what that would look like (round 1, round 2). All of his mockups, however, use English examples. I’m going to start exploring what Ubiquity in Firefox might look like in different kinds of languages. Let’s kick this off with my mother tongue, Japanese.

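(Note for Japanese readers: I will be looking at Ubiquity in Firefox for a variety of languages, and today I take up Japanese. I plan to post the same content in Japanese at a later date. ^^ Comments in Japanese are also very welcome!)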

What commands look like in Japanese

Japanese is not just a verb-final language; it is strongly head-final, meaning that it has postpositions instead of prepositions, direct objects come before verbs, and adjectives precede nouns. As for how it identifies its arguments, every argument carries a postposition or case marker (called a particle in the Japanese literature) that marks its role in the sentence.

A couple of common particles we’ll look at in this example are -を (-o), which marks the direct object (accusative case, you might say), and -に (-ni), which acts like English “to” (dative case). The example sentence we’ll look at today is:

ケーキをブレアに送って(ください)
kēki-o burea-ni okuʔte kudasai
cake.ACC Blair.DAT send.IMP “please”
“Please send a cake to Blair.”

(Note: ʔ is a glottal stop. ACC=accusative, DAT=dative, and IMP=imperative form.)

That final ください is often dropped in very casual speech and, as it adds no new information, we’ll assume today that the user will not enter it. Finally, Japanese doesn’t use spaces in its orthography, so the actual input would be “ケーキをブレアに送って”.

Mockup 1: Particle identification

One of the major hurdles in working with Japanese is that there are no spaces between the words. The natural first step is to split the sentence up into words, but this is a very difficult problem in NLP, one that big-name research groups actively work on.

Fortunately, however, in “Solving the ‘It’ Problem” Aza suggests that, when we encounter ambiguity in our input, we can go ask the user. Great minds think alike, and computer scientist Jean E. Sammet suggested the same idea way back in 1966:

Using English [or any other natural language] definitely involves the requirement for the computer (or more accurately its programming system) to query the user about any possible ambiguity.

Parsing a sentence into words, in the limited context of Ubiquity, is really about identifying the particles which mark the end of each argument. Here’s a mockup of an application of the Sammet-Raskin Method to this problem:

particle-id.png

Pros: This completely takes care of the word-breaking problem, with minimal arbitration from the user. The parser knows exactly what arguments it’s dealing with and the visual feedback means the user won’t be surprised by the parse.

Cons: Most of the particles/postpositions we’d have to deal with are a single character, so they may show up pretty often within words, in which case it would be quite annoying to have to press escape after each one.

An even smarter system, when wanting to mark a character as a particle, would first check to see that the argument (before the particle) is a valid argument type for that particle. If the check fails, it doesn’t have to bother with suggesting that character as a particle. This may cut down on the false positives.
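A minimal sketch of that check might look like the following. Everything in it is an assumption made for illustration: the particle table, the toy set of known people standing in for a noun type, and the function names are hypothetical and are not Ubiquity's actual parser. It also assumes the user accepts every particle that gets suggested.

```python
from typing import Dict, List, Tuple

# Illustrative particle inventory (an assumption, not Ubiquity's real list).
PARTICLES: Dict[str, str] = {"を": "object", "に": "goal", "から": "source"}

# Toy stand-in for a noun type: the only "people" this sketch knows about.
KNOWN_PEOPLE = {"ブレア"}


def is_valid_argument(text: str, role: str) -> bool:
    """Hypothetical noun-type check for the text preceding a particle."""
    if not text:
        return False
    if role == "goal":
        return text in KNOWN_PEOPLE
    return True


def suggest_particles(sentence: str) -> List[Tuple[int, str, str]]:
    """Return (index, particle, argument) for each particle worth suggesting."""
    suggestions = []
    start = 0  # where the current argument begins
    i = 0
    while i < len(sentence):
        # Try longer particles first in case one particle is a prefix of another.
        match = next(
            (p for p in sorted(PARTICLES, key=len, reverse=True)
             if sentence.startswith(p, i)),
            None,
        )
        if match and is_valid_argument(sentence[start:i], PARTICLES[match]):
            suggestions.append((i, match, sentence[start:i]))
            i += len(match)
            start = i
        else:
            i += 1
    return suggestions


for index, particle, argument in suggest_particles("おにぎりをブレアに送って"):
    print(index, particle, argument)
```

In this example the に inside おにぎり is never suggested, because the text before it (お) fails the toy "goal" check, while the に after ブレア passes. A real noun-type check would of course be far richer than a set lookup.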

Smart suggestions: what works, what doesn’t

One of the key suggestions in Aza’s mockups is a way to choose the prepositions while entering your arguments, based on the current verb.

For example, here, the translate command accepts a direct object, a to-object, and a from-object, so little to and from markers magically show up on the right side, making the appropriate prepositions (and by extension the appropriate arguments) discoverable. I think this line of thinking is a really good one, at least for English.

In a verb-final language, however, you enter the arguments first and then the verb, making this strategy of suggesting appropriate arguments impossible. Note that in the user-contributed spreadsheet of how languages identify their arguments we see that about a quarter of the languages we looked at are verb-final—that is, with Subject-Object-Verb canonical word order.

Instead of seeing this as a disadvantage, however, let’s see what verb-final order allows us to do.

Mockup 2: A different kind of suggestion

Not all verbs allow for every different kind of particle. For example, it doesn’t make sense to have a -に (-ni, “to” or dative) argument for a verb like 検索して (kensaku-shite, “search for”). In English we used this to suggest different types of arguments given a specific verb. In a verb-final language, we could do this backwards.

verb-suggestion.png

Pros: This makes verbs highly discoverable, given a certain argument structure. For example, if you enter a few arguments, like a direct object, a “to” argument, and a “from” argument, it’ll suggest verbs that will do something to an object from somewhere to somewhere else. This way, you can easily try out verbs you didn’t even know existed. It’ll only give you verbs appropriate for your arguments, reducing the chance of writing an infelicitous command.

Cons: Without knowing what kinds of actions are available, it may be difficult to know what kinds of arguments to enter in the first place. If you have a specific verb or service you want to use it may be counterintuitive or downright tricky to start by guessing the right set of arguments.

In addition, from a technical point of view, this requires many of the prediction algorithms in English Ubiquity to run backwards. Ideally, there would be a closed (predetermined) class of particles and a predefined set of noun types. Verbs would not be able to define their own modifiers and noun classes as easily or freely as they can now.
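To make the "backwards" suggestion in Mockup 2 concrete, here is a small Python sketch under stated assumptions. The verb table, the role names, and the function name are hypothetical; 翻訳して ("translate") is my own addition to the two verbs used in this post, and none of this reflects how Ubiquity actually stores its commands. It assumes a closed set of roles, as described above.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical verb table: each verb lists the argument roles it accepts.
VERBS: Dict[str, FrozenSet[str]] = {
    "送って": frozenset({"object", "goal"}),              # send
    "検索して": frozenset({"object"}),                     # search for
    "翻訳して": frozenset({"object", "goal", "source"}),   # translate (assumed)
}


def suggest_verbs(entered_roles: Iterable[str]) -> List[str]:
    """Suggest verbs whose accepted roles cover every role entered so far."""
    entered = set(entered_roles)
    return [verb for verb, accepted in VERBS.items() if entered <= accepted]


# After the user has typed a -を argument and a -に argument:
print(suggest_verbs(["object", "goal"]))
```

With an object and a goal entered, 検索して is filtered out because it does not accept a -に argument, which matches the example given for Mockup 2. The subset test is the simplest possible rule; a ranking that prefers verbs whose roles match exactly would be a natural refinement.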

Conclusion

The properties and challenges of Japanese grammar require that we not simply copy the English behavior, but instead think about what really makes sense in that language. That may be an important lesson as we move toward designing a localizable Ubiquity. Please post your questions and criticisms of this design, or post your own mockups!

84 Responses to “Ubiquity in Firefox: Focus on Japanese”

  1. marsf Says:

    Hi, Japanese does not have a fixed word order, so I think the parser should be oriented around the particles.

    In addition, what do you think about the ラーメン command? http://mozilla.l10n.jp/~mar/ubiquity/nakanoRamen_...

  2. mr.aleph Says:

    It looks like Japanese is indeed a very interesting language.
    Probably I should try to learn it =)

  3. Georg Says:

    In reading this I am asking myself, is it really that important to try to abide by another language's word order rules? Are we doing this for the sake of intuitive use only, i.e. designing the text input system as close to the use of the foreign language, hoping that a new user will be able to use the system right off the bat?

    Would it be so bad if Ubiquity didn't care about other languages' word orders and imposed the order used in its native tongue - English?

    How difficult would it be for non-English users of Ubiquity to adapt to that imposed word order? Are we afraid we'd turn users away if we made them adapt?

    Sure it would be nice to provide a near-natural-language interface, but isn't this overkill?

    Look at programming languages: I may be ignorant, but I'm not aware of a country that has translated a programming language into their own native tongue. Is there a Japanese version of JavaScript, a French dialect of C?

    However, looking at the implications of the reverse word order paradigm that suddenly make verbs the (more) discoverable items instead of its arguments brings up an interesting point.

    If the arguments are entered first and the verb becomes a discoverable entity, that could make it much easier for new users who aren't familiar with the system's vocabulary yet. But, given one or two arguments I would think that the user would expect Ubiquity to prompt him with a list of applicable verbs based on the type of argument(s), not necessarily their declinations.

    For instance, if I were to first enter an address as an argument in a reverse order interface, Ubiquity should come up with the verbs/nouns "map" and perhaps "directions." This would be perfect if the user didn't know the correct Ubiquity word to map the address. Though, this poses a new problem, namely how to teach Ubiquity to interpret strings as addresses. Or how to teach Ubiquity to interpret strings as anything, since it would have to offer a set of applicable verbs based on a given argument.

    If this reverse word order model turned out to be indeed much more helpful for the newbie Ubiquity user, then it should be made available in the English and all other language versions as well.

    Of course, this would complicate things even more, not just from the developer's point of view.

    So, I am wondering if the interface should not be kept as is: simple and straight-forward across all languages.

  4. Georg Says:

    Oh, one other thing.

    Can you provide instructions on how to make Firefox display the Japanese characters correctly? Currently, I see squares filled with 4-digit hex numbers, which I believe represent a character's unicode?

    I'm using Firefox 3.0.6 on Windows.

  5. Keisuke Omi Says:

    Mockup 1 feels too open ended and looks like you'll have to support more varieties of sentence structures. For example if you can do 「ケーキをブレアに送る」then it's equally correct to do「ブレアにケーキを送る」

    I think Mockup 2 has the advantage of suggesting a command structure (structure makes it easy to be read by machine) disguised as autocomplete/suggestions (which is good for the user because it means less typing more opportunity for discovery).

    "Without knowing what kinds of actions are available, it may be difficult to know what kinds of arguments to enter in the first place."

    How about bringing the service parameter to the beginning of the sentence? If the first word that gets typed, 「Amazon」, is the service, then you can deduce that the verb at the end of the sentence is going to be 「買う」or「検索」. I know this makes parsing the sentence difficult because the system is going to have to have different logic than English, but isn't it good news that everything between the service name and verb is in the same order?

    I have a few example comparisons between English (with verb first, service names last) commands and Japanese (with service name first, verb last) to show how what is in between is the same order: http://keisukeomi.com/ubiquity-in-firefox/japanes...

    The image also shows how the system can encourage the user to type in commands with a specific order while keeping the perception of free-form input. The idea is that the grey text would be displayed at key points of input and would act as a prompt/reminder for what should come next.

    I think there are other changes to Ubiquity that need to be thought about so it feels native to Japanese users.

    1) ウビクイテー? ユビクイティー? The name "Ubiquity" sounds great in English but it's just so awkward in Japanese. It's not a common word that you learn in school (I assume), so no one knows what it means either. I think if mass adoption in Japan (and other parts of the world where "Ubiquity" isn't part of the vocabulary) is a goal, then an alternative name will go a long way.

    2) The 変換候補 panel that gets displayed when entering Japanese gets drawn behind the Ubiquity panel. Maybe this is a system-wide problem with panels in Firefox? See screenshot here: http://slashcolon.com/wordpress/2008/09/08/ubiqui...

    That being said, this is looking great! I'm enjoying following the design process for a feature that requires thinking about localization at such a fundamental level. I feel localization is only skin deep for most software/webapps because teams assume that replacing text strings and images is enough. With Ubiquity, you have no choice but to explore how much of the core functionality has to be customized for each locale to make it appear native to whoever is using it.

  6. Aza Raskin Says:

    The ラーメン command is great, isn't it! I'll definitely use it when I go to Japan.

  7. mitcho Says:

    I didn't make this clear in the above mockup, but a sentence with the -に argument before the -を would work just as well for this system.

    As for the ラーメン、 that's a really interesting question. This also has to do with how grammaticalized different services are as verbs… for example, in English you can "map" things or "google" things, whereas in Japanese, traditionally, you can't use either as a verb… you must "find on a map" or "search with google" (although マッピングする for "map" can be heard more and more, so I used it in this mockup). This also has to do with Jono's suggestion for "overlord verbs" if you're interested.

  8. marsf Says:

    > a sentence with the -に argument before the -を would work just as well for this system.
    I know this well. The ticket #425 is my post. :-) https://ubiquity.mozilla.com/trac/ticket/425

    My point is, the current JP parser system doesn't handle each particle. The particles are just used for separating a given sentence. All JP commands for the JP parser must indicate arguments by using a "modifier" (they can't use direct input). The same command is not compatible across languages.

    IMHO, a "NULL" particle is needed to solve the compatibility problem.

  9. Verb-final languages: an advantage? « Not The User’s Fault Says:

    […] | Tags: internationalization, linguistic UI, ubiquity |   Over at his blog, Mitcho has some very sharp thoughts about localizing Ubiquity to verb-final languages such as […]

  10. links for 2009-02-23 « 個人的な雑記 Says:

    […] mitcho > blog > Ubiquity in Firefox: Focus on Japanese (tags: ubiquity) […]

  11. kourge Says:

    I know that this has less relevance in verb order, but there seems to be a hidden technical issue regarding the input of Asian languages on any text field in Firefox. The text field is not aware of any new input from the IME until the IME commits the text, and that may well cause the problem of Ubiquity not being able to scan what text is inputted into the URL bar (or whatever text field it's using) when the user is using an IME.

    There is also a problem with keystrokes like Esc, because Esc usually clears the input buffer of the IME. I've seen the same problem in Songbird's online translation system; it was designed to save the current string and jump to editing the next string when the user hits the Enter key, interfering with the IME committing the input buffer. I'm not sure why it is that these keystroke events leak to the text fields while the text in the input buffer does not.

    In any case, it would be fun to see if it is productive to force Chinese into OVS order using a (highly unnatural) passive voice. I talked to a native German speaker a few days ago and she and I were chatting about how it was interesting that German speakers usually place the main verb at the end of a sentence, almost creating an anticipating effect.

  12. mitcho Says:

    @kourge, there may be some differences in behavior between platforms and IMEs… on my machine here (OS X Kotoeri on Firefox 3.1) Ubiquity recognizes and can start parsing text which hasn't left the Japanese conversion yet. As for the escape key, though, you're right that that may cause some trouble. We'll definitely need to look into these issues with different IME/OS/language combinations in the future if there is to be a future for this kind of quick feedback cycle.

  13. mitcho Says:

    Hmm… not sure about this… did you try manually setting the text encoding? It should be UTF-8.

  14. mitcho Says:

    Hi Georg, thanks for your comments. ^^

    You can find some of my thoughts on trying to approach this "natural syntax" here. The upshot is that if Ubiquity is to gain a wider audience, we don't want to have to force users into a partial English, especially for users who have no prior knowledge of English and are not interested in learning a "programming language." Whether this is a viable goal or not is another issue… but one that we won't resolve unless we try. ^^ There have indeed been efforts to make programming languages based on different natural languages… there's some info on that on Wikipedia.

    I agree that the verb-suggestion based on nouns is an interesting model, and that if fruitful it should also apply to English and other verb-initial (or at least not verb-final) languages. Jono points out that the English parser in Ubiquity indeed already has this to a certain extent… I encourage you to check it out.

  15. kourge Says:

    Interesting, I'm on OS X, 10.5.5, Kotoeri, Firefox 3.0.6, and the URL bar, for example, can't find whatever is in my history until I commit the text. The same applies to OpenVanilla, and I'm too lazy to try Hanin. I'm guessing that it could be that you're using Firefox 3.1.

  16. kourge Says:

    Oops, I forgot to reply to that thread and replied to the post instead.

  17. User-Aided Disambiguation: a demo Says:

    […] few weeks ago I made some visual mockups of how Ubiquity could look and act in Japanese. Part of this proposal was what I called “particle identification”: that is, immediate […]

  18. Ubiquity Parser: The Next Generation Demo Says:

    […] (1) the use of overlord verbs, (2) writing verbs by semantic roles, and (3) better suggestions for verb-final languages and other argument-first contexts. I’m happy to say that I’ve spent some time putting a […]

  19. Cirurgia Plastica Says:

    Japanese is a complicated, but indeed a beautiful language.

  20. Scoring and Ranking Suggestions Says:

    […] input first-class citizens of Ubiquity, improving their suggestions in particular to the benefit of verb-final languages. Arguments will be split up and tested against different noun types before a verb is even entered […]

  21. NOSE Takafumi » Bagel2について (その2) Says:

    […] Ubiquity in Firefox: Focus on Japanese […]

  22. Ubiquity Localization: What’s New, What’s Next Says:

    […] 2 also adds better argument-first suggestions, inspired by some earlier thoughts on Ubiquity in Japanese. Ubiquity will now start to parse arguments in the input even if a verb isn’t found, and […]

  27. los angeles dj Says:

    I am certain this is a very unique piece of work. The Japanese grammar and sentence structures are quite complex in written forms, and great to see you take on the challenge.

  28. Vintage china vase Says:

    This is a great tool for the Firefox browser. I use it sometimes when I read Japanese pages. Thanks for sharing this.

  31. victor Says:

    Japanese language is very tough to understand and frame.

  32. auctions Says:

    Japanese language is very interesting to learn but at the same time it is very hard..Thanks for the great article

  33. Dating Says:

    Learning Japanese can be a bit difficult but how difficult it will be in programming..Thanks for this great learning tutorial

  35. Double Trouble Says:

    I have been studying Japanese for a long time. I think it's the most beautiful and interesting language in the world.

    ありがとうございます

  37. easy diets Says:

    I once though of learning Japanese when I was early twenties but no more. Now I don't work in Japanese company anymore. Time flies - 25 years.

  39. Mmorpg Says:

    A lot of Sites with japanese characters don't load for me =[

  42. Doherty & Catlow  Says:

    It surely seems like it anyways! I am sure glad I'm not the one trying to get this figured out, I'd probably go insane.

  44. San Diego DUI Lawyer Says:

    It's interesting to see this in another language, especially Japanese! Seems like getting this to work 100% will take a lot of time and patience.

  45. expedia Says:

    I look forward to a truly ubiquitous experience.

  46. Pest control tulsa Says:

    This is something that I will find very useful, thanks!

  47. neil Says:

    I like the argument in this thread that the programming shouldn't necessarily follow an English model, or translate from English to Japanese. How would it work if everything started from Japanese and moved on from there? Or at that point, would we be talking about an entirely different thing? Do you need to be able to go back from English to Japanese and vice versa?

  48. unnamed Hero Says:

    Your site is very successful continuous tracking. ;)

  49. Hermes Handbags Says:

    Thanks for sharing these info with us! I was reading something similar on another website that i was researching. I will be sure to look around more. thanks

  50. designer sunglasses Says:

    I am very happy to find a very good article related to firefox. Firefox is the leader of browsers on internet. :)

  51. balunov6 Says:

    Thanks for share your thoughts. Great thinking, good post.

  52. online appointments Says:

    A nice blog.i think that this session will take a good, hard look at everything that's good

  54. Online appointments Says:

    that's really a fantastic post !  ! added to my favourite blogs list..

  57. SEO Says:

    Why won't my Firefox browser automatically take me to a website by typing one word in the URL but others' Firefox will?

  58. video converter Says:

    Thank you!

  59. Unlimited Soft Says:

    Very interesting!

  60. picture to painting Says:

    A friend of mine in high school was from Japan and tried their best to help me learn some Japanese. I tell you what, though… that is one hard language to learn. Give me Spanish any day!

  61. DTV Antenna Says:

    I now have your site bookmarked…your site always has great and interesting articles to read..thanks!

  62. Mary Lynn Smith Says:

    Japanese is such a beautiful language as well as an ancient language. We have a large oriental population and it has been a pleasure to be in their company.

  63. Smart Smoker Says:

    I agree, and the country japan and also beautiful country!

  64. St Martin Villas Says:

    Ubiquity is an awesome tool for firefox! I'll share this with my Japanese friends!

  65. Georg Says:

    Wow, I sure would like to know the meaning behind the symbols.

  66. blu ray Says:

    thank you

  67. capsiplex chili Says:

    There are many Japanese words that have no literal translation to english. It is a very beautiful language however.

  68. Jarquel Says:

    Wow! Very nice!

  69. Rose Says:

    I am also trying to learn, but I can't find a good language center in my country. That is why I am trying to learn the language from online media. Can anyone suggest an online Japanese language learning website?

    Thanks in advance.

  75. Maxim Says:

    Hadnt know that there is a tool like this…thanks for the nice read :)

  76. Darlehensportal Says:

    I have been brought up with two languages and had learned later two further ones. But all these are of European origin and I find it just amazing and fascinating how all the different languages worldwide do differ from each other.

  77. Hard Money Lenders Says:

    Wow I never thought that there would be so much involved because of the Japanese language.

  78. Mohsin Says:

    I loved FIrefox very much i dont use any other browser :)

  79. yasin Says:

    I desire to learn this language, actually just have to learn quickly.

  81. Mal1 Says:

    Nice nice, thanks :-)

  82. pens parker Says:

    I gotta say Japanese is not easy to master especially the alphabet, but its well worth the effort

  84. Komono03 Says:

    Firstly I would like to thank you for some info you showed here. It is interesting to know. Also I like reading the conclusion you posted: "The properties and challenges of Japanese grammar require that we not try to outright copy the English behavior but to think about what really makes sense in that language and that may be an important lesson as we move toward designing a realizable Ubiquity". Furthermore, I am learning Japanese and I indeed like this language.


© 2006-2010 mitcho (Michael 芳貴 Erlewine).
The views expressed on these pages are mine alone and do not
reflect those of my employers and clients, past and present.