Posts Tagged ‘linguistics’

Testing Google’s Language Detection

Saturday, May 17th, 2008

As Google adds ten more languages to its machine translation service, it seems to be on its way to becoming the most convenient universal translator of the world’s popular languages. Google’s handling of languages of course isn’t perfect, however—in particular, I’ve been complaining to friends for a while about the weaknesses of Google’s handling of queries in Chinese character (漢字/汉字) scripts. In this post, I run some tests using Google’s Language Detection service to try to better understand its handling of Chinese character queries.

Background

Chinese characters have been used all across East Asia, most notably in Chinese, Japanese, Korean, and Vietnamese (the “CJKV”). Prescriptivist writing reforms in Communist China and Japan have simplified many characters, though. Some characters were simplified in the same way, some in different ways, and some in only one country but not the other. For more information, there’s Wikipedia or Ken Lunde’s CJKV Information Processing.

The problem

The issue comes up when you search for a word written in Chinese characters that clearly comes from one particular Chinese character-using language. In my experience, Google doesn’t use the query to infer which language you are searching in, and returns many results in other Chinese character-using languages as well.[^1]
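To make the idea of a query that "clearly came from" one language concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about how Google actually works: the variant table is a tiny hand-picked sample, the mapping of each character to a writing standard is a simplification, and the name `candidate_standards` is made up for this example.

```python
# Writing standards, labeled with BCP 47 tags.
TRAD, SIMP, JA = "zh-Hant", "zh-Hans", "ja"

# Hand-picked sample (an assumption for illustration, not an exhaustive or
# authoritative table): which standard orthographies use each character form.
CHAR_STANDARDS = {
    "國": {TRAD},       # traditional form of "country"
    "国": {SIMP, JA},   # simplified the same way in China and Japan
    "廣": {TRAD},       # traditional form of "wide"
    "广": {SIMP},       # simplified differently in China...
    "広": {JA},         # ...and in Japan
    "氣": {TRAD},       # traditional form of "air, spirit"
    "气": {SIMP},
    "気": {JA},
    "車": {TRAD, JA},   # simplified only in China
    "车": {SIMP},
}


def candidate_standards(query):
    """Return the writing standards consistent with every known character."""
    candidates = {TRAD, SIMP, JA}
    for ch in query:
        if ch in CHAR_STANDARDS:
            # Characters missing from the table carry no evidence either way.
            candidates &= CHAR_STANDARDS[ch]
    return candidates


print(candidate_standards("天気"))  # only the Japanese form 気 narrows this
print(candidate_standards("中国"))  # still ambiguous between two standards
```

Even with such a crude table, a single distinctive character can narrow a query down to one standard, while a query like 中国 stays ambiguous. An empty result would mean the query mixes forms from different standards.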

Linguistics in 嘉義

Tuesday, May 13th, 2008

A couple of weeks ago I went to Chiayi (嘉義, pinyin: Jiāyì) to present a paper at the Linguistic Society of Taiwan’s National Conference on Linguistics.[^1] I got a chance to meet some wonderful and kind Taiwanese linguists, make friends with some linguistics students, and explore the city of Chiayi.

White Protestants and Catholics don’t frequently attend religious services

Wednesday, February 13th, 2008

Breaking news from the Potomac Primaries:

White Protestants and Catholics backed Mrs. Clinton, but Mr. Obama was strongly supported by voters who frequently attend religious services.

Seeing as backing Mrs. Clinton and supporting Mr. Obama are, in terms of votes, mutually exclusive, this sentence entails that white Protestants and Catholics (the majority of ) are not a part of “voters who frequently attend religious services”, as is demonstrated by the infelicity of the following sentence:

“Group A did A, and Group B did not do A — but Group A is part of Group B.”

Well, that just settles it then.

Eats, shoots, and leaves

Monday, December 17th, 2007

I just read Clause and Effect (via DF), a great editorial discussing commas in the Second Amendment and their effects on interpretation of the law. I found this timely as Bailey and I just watched Institutional Memory, the penultimate episode of The West Wing, where Toby Ziegler discusses a comma in the Fifth Amendment’s Takings Clause: “nor shall private property be taken for public use[,] without just compensation.” BBC’s H2G2 has a pretty good write-up and there’s a listing of relevant links as well.

The funny thing about all of these is that we don’t speak commas. They’re used to graphically represent pauses in speech, but are often placed according to certain artificial rules which, when applied systematically, aim to help the reader parse the sentence or disambiguate between different readings.1

I’m surprised Language Log hasn’t picked up this new piece yet. UPDATE: Yup, they got to it. Great coverage, as always.


  1. We use pauses in spoken language to do this too, but not necessarily in the same places that we place commas in “good” written language. 

ETA-ROC and another weekend in Taipei

Monday, November 12th, 2007

I spent this past weekend in Taiwan, attending the English Teaching Association of the Republic of China (ETA-ROC) conference. While the original intention was for a number of us ETAs to go, I ended up going alone. I saw a number of talks Saturday… I went to a number of the more theoretical or quantitative talks and had a great time. I saw Krashen talk again, this time on the Comprehension Hypothesis. I have to say, he’s a fabulous speaker, and the case studies he looked at for this talk were fascinating: a Mexican immigrant who worked in a deli and learned Hebrew before he knew it, a culture where the rule is that you can’t marry someone who speaks the same language as you, etc. ^^ I also saw Andrew Cohen from Minnesota, which made me miss Minnesota a bit.

The conference was held at the Chien Tan Youth Activity Center which has a beautiful pond and great view of the Grand Hotel, on the site of an old Shinto shrine.

IMG_9767IMG_9768 IMG_9770

As I recently did a little editing for a journal on English teaching here, I was invited to the presenters’ dinner Saturday night. While it was slightly awkward at first, not being a presenter myself, I soon met two representatives from the Korea and Philippines TESOL organizations who were very kind to me and we had some great conversations and laughs. (They are the two on the right in the first photo. The second photo is with the Filipino representative, Bernard Spolsky and me.)

PB100231PB100233

I stayed overnight Saturday at the Eight Elephants hostel. Less than a year old, Eight Elephants is stylish, clean, and comfortable, though not the cheapest hostel in town. My experience there was great… I made a friend, a student of Special Education from Kaohsiung, and we went out to the nearby Shida night market. After we randomly ran into Kate, who was in Taipei with her host family, my new friend took me to a cafe she knew and we had a great time talking. While her English is great as well, we talked entirely in Chinese. After spending the day thinking about comprehensible input, it was great listening to her, understanding about 80%, and chiming in once in a while. As her interests were teaching and learning languages (including Japanese), we hit it off well with some great conversation. I look forward to seeing her again when I visit Kaohsiung in the near future.

On Sunday morning I saw another talk by Andrew Cohen, had lunch, and met up with a couple of the interns at the Fulbright Taiwan foundation who showed me around Taipei. We went to the Chiang Kai-shek Memorial Hall and randomly ran into Dr. Wu Jing-jyi, the director of the Foundation, on the plaza. We then went to check out the Taipei Modern Art Museum (with the first .museum address I’ve ever actually seen), which was super cheap and very enjoyable, albeit relatively small. (The last photo below is at the Taipei Story House, which is a historic building; we just took a picture outside without going in.)

IMG_9776IMG_9777IMG_9778 IMG_9779

We had some Hong Kong-style 燒臘 preserved meat for dinner. I came back to Nanao Sunday night feeling fulfilled and blessed by the people I’d met all weekend, at the conference, at the hostel, and around the city.

The Nerd Handbook

Monday, November 12th, 2007

From Rands in Repose’s Nerd Handbook, probably a good guide for Bailey (though I don’t quite fit the target completely):

But in nerds’ bit-based work, progress is measured mentally and invisibly in code, algorithms, efficiency, and small mental victories that don’t exist in a world of atoms.

I feel this phenomenon exists in formal linguistics as well, where the elegance of an analysis may be measured in theory-internal terms. It’s hard to get other people excited when they don’t share that same background, precisely as there is no physical manifestation of an analysis. At least Bailey’s good about listening, trying to understand, and being happy for me. ^^

(via Daring Fireball)

Krashen the party

Friday, November 9th, 2007

Yesterday we ETAs went to a workshop at Lan-Yang Institute of Technology. The workshops focused on the instruction of reading. The three afternoon sessions we saw included two workshops on building vocabulary and one by Stephen Krashen.

Krashen is kind of like the Chomsky of language acquisition and teaching: a huge and controversial (some may say incendiary) figure whom you can love or hate, but can’t ignore. Last Wednesday in our weekly workshop, Dr. Collins delivered a chronological rundown of Krashen’s theories.1 As an entertaining aside, one task given to us was to draw a schematic diagram of Krashen’s view of language acquisition and production. Below is Dale’s drawing, which eerily reflects the geography of the brain… the input comes in through the ears (or eyes, at the back of the brain), then hits the Affective Filter (the amygdala), goes to the Language Acquisition Device (Broca’s and Wernicke’s areas), and then the output is filtered by the Monitor, a product of conscious learning (the frontal lobe). Pretty creepy.

dales-brain

Krashen’s talk2 was fascinating, albeit not what I expected: given that the workshop’s focus was on the teaching of reading and that he himself has been a big advocate of recreational reading for language learners, I expected more on teaching English reading to non-native speakers. The majority of the talk, though, was on writing and the composing process: “reading more makes you a better writer, but writing more makes you smart.” He talked about how the act of (regular) writing clarifies and organizes our thoughts, and advocated a writing process involving much revision, since “every time you have to revise, it means you’ve become smarter,” and building relaxation (to allow for eureka moments) into the process. His conclusions and analysis are as important for first-language speakers as for second-language learners, and the talk did feel more like a writing seminar than a pedagogical one. Krashen is an engaging and entertaining speaker, using many examples from famous writers and common experience to draw his conclusions.

The intensity with which he spoke and the passion for thinking about thinking reminded me of Sally’s Honors Analysis class, which was as much about thinking as it was about mathematics. Sally once told us that, when we’re stuck on a problem, we should find someone just about as smart as us and just explain the problem to them. He claimed that the majority of the time, the simple process of explaining the problem out loud and answering clarifying questions would make the solution come to us. It’s a powerful technique that I’ve used many times at Chicago and elsewhere, and Krashen’s analysis of what happens when we write thus struck a chord with me.

Afterwards I was fortunate enough to go out to dinner with the speakers, some of our advisors, and some faculty from the Institute that hosted the workshop. I had some great conversations about my background, where my future directions may lie academically, and of course the ideas. ^^ It reminded me of dinners with linguists back at home, after a workshop or CLS. I realized I miss the fraternity of academia—the sense of mutual respect and interest academics have for each other’s work and ideas, even if the “other” is only 22 years old.


  1. A similar basic rundown of Krashen’s various theories can be found in the blog post The Krashen Revolution.

  2. Krashen, Stephen. “What is Academic Language Proficiency,” presented at the International Conference and workshops on English Language Teaching: Pedagogical Aspects of Reading. Yilan county, Taiwan, November 8th, 2007. 

24.863529, 121.801491

I’m busy to die

Tuesday, October 30th, 2007

Today at work: the military guy who has quite good English told me that he was very busy as our school is being observed next week by administrators. He then told me, “I’m busy to die.”

While I originally thought he might have mispronounced “today,” he obviously knows that word… I believe he was trying to say “我忙死了,” a Mandarin resultative construction which could be translated “I’m busy to the extent that I will die.” Obviously this is not literal… V+死 compounds are a common form of exaggeration. It was a neat instance of grammatical transfer, though.

Pinker wins, this time

Saturday, September 15th, 2007

An email from Bailey:

I heard Steven Pinker on NPR! Remind me to tell you about it (unless my excitement is not mutual). I wanted to call in and say “thanks so much for making linguistics accessible and interesting to us laypeople, I love the work that you do; my boyfriend recently received his Master’s in linguistics, but the stuff he works on is syntax in Mandarin Chinese, and it’s completely impenetrable.” But I didn’t. <3

Alas, this is the life I live.