mitcho Michael 芳貴 Erlewine

Postdoctoral fellow, McGill Linguistics.


Posts Tagged ‘code’

Every website has a purpose

Wednesday, June 2nd, 2010

Every website has a purpose. Maybe you want people to buy a product, donate to your cause, download your app, or subscribe to your mailing list. How can you confidently modify your site to make it more effective with respect to this goal?

A/B testing is a process by which multiple variants of a website are presented to different users randomly and statistical tools are used to see whether any variant is more effective, according to an overall goal metric such as conversions or revenue.
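To make the "statistical tools" part concrete, here is a minimal sketch of one common choice, a two-proportion z-test on conversion counts. The function name and the visitor numbers are made up for illustration, and this is not necessarily the test ShrimpTest uses.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis that A and B perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: 1000 visitors per variant, 100 vs. 130 conversions.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be chance alone. The sketch assumes both variants had visitors and that the pooled rate is strictly between 0 and 1.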

While various A/B testing products—many free—exist, none are made from the ground up to work within the WordPress ecosystem. I believe a solution made particularly for WordPress could make A/B testing so much easier and more straightforward, and that such a solution could be greatly beneficial to the platform as a whole.

I’m happy to announce my new project, code-named ShrimpTest,1 which is directly aimed at filling this void. I’ll be working on this project this summer together with the fantastic folks at Automattic.

The best way to keep up with development is on the project’s development blog, the ShrimpTest P2. Most updates will likely be much shorter than this initial post. ;) You can get less frequent, milestone-like updates by following ShrimpTest on Twitter. Development will be open, so feel free to check it out (haha) and submit patches as well. As I go along, I’ll also look forward to your feedback.

  1. Five dollars to the first person to correctly guess why I’m calling it ShrimpTest. 

Better Linguist List RSS Feeds

Monday, April 26th, 2010

Everyone I know in linguistics uses the LINGUIST List website to a greater or lesser degree. Linguist List began as a mailing list in the 1990s, with book, job, and dissertation announcements, calls for papers, and general academic discussions.

Nowadays many people follow the various announcements on Linguist List using an RSS feed reader such as Google Reader or my personal favorite, NetNewsWire.

Unfortunately, the Linguist List RSS feeds (at least recently) don’t include the full text of the articles and have a few other quirks as well. It’s often hard to judge based on the title whether it’s really something I’m interested in or not, so I’ve spent a lot of time frustratedly opening any possibly interesting-looking entry in a separate NetNewsWire tab. Today I decided enough was enough: I just wrote a script which parses each of the Linguist List RSS feeds, finds the actual descriptions and interleaves them.1 It’s working remarkably well so far:


  1. Veteran Linguist List RSS subscribers will also note that I’m adding the full title to the entry title for the Conferences and Calls lists as well. 
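The feed-rewriting idea in this post can be sketched in a few lines of Python. This is an illustration only, not the script I wrote: the sample feed, the example.org link, and the `fetch_full_text` callback are all hypothetical stand-ins, and a real version would fetch each linked page over HTTP and extract the announcement text.

```python
import xml.etree.ElementTree as ET

def expand_feed(rss_xml, fetch_full_text):
    """Return RSS XML in which every item's description holds the full text.

    fetch_full_text is a caller-supplied function (hypothetical here) that
    maps an item's link to the full announcement text.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.findtext("link", default="")
        description = item.find("description")
        if description is None:
            # Some feeds omit the description entirely, so create one.
            description = ET.SubElement(item, "description")
        description.text = fetch_full_text(link)
    return ET.tostring(root, encoding="unicode")

SAMPLE = """<rss version="2.0"><channel><title>Example feed</title>
<item><title>21.1234, Calls</title><link>http://example.org/21-1234</link>
<description>Short teaser</description></item>
</channel></rss>"""

# Stand-in for a real HTTP fetch-and-extract step.
fake_pages = {"http://example.org/21-1234": "Full call for papers text."}
print(expand_feed(SAMPLE, fake_pages.get))
```

Passing the fetch step in as a function keeps the XML handling separate from the network code, which makes the rewriting easy to try out on a saved copy of a feed.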

Spring is for Speaking: JSConf, WordCamp SF, IACL

Saturday, March 20th, 2010

I recently confirmed three different very exciting speaking gigs which I’ll be doing this spring:


Jetpacking in Boston

Saturday, March 13th, 2010

A couple weeks ago I gave a talk at the Boston JavaScript meetup introducing Jetpack and filling people in on the latest developments in the project, including the Reboot. Between 20 and 30 people came to the talk, which was at Microsoft Cambridge. Here are the slides from the talk:1

Extend the Browser with Jetpack


  1. If anyone would like the Keynote deck, just let me know. 

After the Deadline for Firefox

Monday, February 1st, 2010

After the Deadline is a powerful and intelligent proofreading tool which checks for spelling errors, misused words, some grammatical gaffes, and even some stylistic issues. For the past month, I’ve been working for Automattic, the company behind AtD and the makers of, to create a Firefox add-on which enables this superior technology everywhere on the web. Words can’t do justice to the magic that is AtD, so here’s a video we put together:

I invite you all to give it a spin:


Working on After the Deadline for Firefox gave me my first experience creating an add-on from the ground up and I’ve learned a lot. After working on Ubiquity and dabbling with Jetpack, it’s given me another perspective on extensibility on the web and I look forward to thinking and writing more about these experiences in the near future.

In the meantime, happy proofreading! :D

WordCamp Boston 2010

Friday, January 29th, 2010


This past weekend I gave a couple talks at the inaugural WordCamp Boston. WordCamps are local, community-organized events for WordPress users and enthusiasts. We had about 400 people at the Microsoft Cambridge campus.


Creating an image-sized iframe overlay with Shadowbox

Wednesday, January 13th, 2010

I have recently been working with the Shadowbox JavaScript library for an upcoming revision to the MIT Edgerton Digital Collections website. Shadowbox is a nice, modular lightbox library designed to work with various JavaScript libraries like jQuery, Prototype, and MooTools.

Shadowbox is organized around different “players”—one for each kind of media that will be displayed. The library by default comes with players for Flash, HTML fragments, iframes, QuickTime, and Windows Media. Some of these players, like those for images and video, automatically recognize the media size and adjust the lightbox accordingly, while others such as the iframe player can use a set size or can fill the screen. For the Edgerton site, though, we had a need for displaying an iframe but in the dimensions of a set image, so that we could display the image with an overlay. Here are some notes on how to implement a custom player for Shadowbox.


Disgusting Word-formatted HTML and how to fix it

Wednesday, December 30th, 2009

In working on a new website for the MIT Working Papers in Linguistics, I recently inherited a collection of HTML files with all of our books’ abstracts. To my dismay (but not surprise) the markup in these files was horrendous. Here are some of the cardinal sins of markup that I saw committed in these files:

  1. Confusing ids and classes. ids should be unique on the page… but here’s an instance of using the same id multiple times in order to format elements together.
<div id="indent"> <div id="number">4.2.1</div> <div id="page">161</div> <div id="section">Old French (Adams 1987)</div> </div>
<div id="indent"> <div id="number">4.2.2</div> <div id="page">164</div> <div id="section">The evolution of the dialects of northern Italy</div>
  2. Putting a class on every instance of something. Every paragraph should be formatted equivalently. We get the point.
<p class=MsoNormal><b>The English Noun Phrase in Its Sentential Aspect</b></p>
<p class=MsoNormal>Steven Paul Abney</p>
<p class=MsoNormal>May 1987</p>
  3. Using blank space for formatting.
<p class=MsoNormal><o:p>&nbsp;</o:p></p>
  4. CSS styles that don’t exist. Browsers just ignore these anyway…
<p class=MsoNormal>One factor in determining which worlds a modal quantifies
over is the temporal argument of the modal’s accessibility relation.<span
style='mso-spacerun:yes'>  </span>It is well-known that a higher tense affects
the accessibility relation of modals.<span style='mso-spacerun:yes'> 
</span>What is not well-known is that there are aspectual operators high enough
to affect the accessibility relation of modals.<span style='mso-spacerun:yes'> 

The solution

My solution was to write a Perl script which takes care of a number of these issues. It’s not foolproof and doesn’t involve any voodoo (for example, it can’t retypeset things which were formatted using whitespace), but it does a good job as a first pass.

You can run the script by making it executable (chmod +x) and then specifying a target filename as an argument. For example,

./ source.html > clean.html

I used this with a simple bash for loop to run over all my files:

for f in */*.html; do ./ "$f" > "${f%.html}-clean.html"; done
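For readers who just want the general idea, here is a minimal Python sketch of this kind of cleanup. It is not the Perl script itself: the function name and the regular expressions are illustrative assumptions, and they only target the patterns shown in the examples above. Regular expressions are a blunt tool for HTML, so treat this as a first pass too.

```python
import re

def clean_word_html(html):
    """Strip some common Word-export clutter from an HTML string."""
    # Drop empty spacer paragraphs such as <p class=MsoNormal><o:p>&nbsp;</o:p></p>.
    html = re.sub(r"<p\b[^>]*>\s*<o:p>(?:&nbsp;|\s)*</o:p>\s*</p>", "", html, flags=re.I)
    # Remove any remaining Office-namespace <o:p> tags, keeping their contents.
    html = re.sub(r"</?o:p[^>]*>", "", html, flags=re.I)
    # Replace whitespace-only spans carrying an mso-* style with a single space.
    html = re.sub(r"<span\s+style=(['\"])mso-[^'\"]*\1>\s*</span>", " ", html, flags=re.I)
    # Remove Mso* class attributes, quoted or unquoted.
    html = re.sub(r"\s+class=(['\"]?)Mso\w+\1", "", html, flags=re.I)
    return html

sample = (
    "<p class=MsoNormal>One.<span\nstyle='mso-spacerun:yes'>  </span>Two.</p>"
    "<p class=MsoNormal><o:p>&nbsp;</o:p></p>"
)
print(clean_word_html(sample))
```

The order matters: the empty spacer paragraphs are removed while they are still recognizable, before the class attributes and stray tags are stripped.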

Hopefully someone else can benefit from my experience.

Mashing up the browser in Maine

Saturday, December 19th, 2009

Last week I was invited to give a talk at the TechMaine annual conference in Portland, Maine.

Since I had a longer time slot than I had previously used to talk about Ubiquity, I decided to dedicate a good portion of the talk to Jetpack. Having been outside of Mozilla for the past few months, I took this as an opportunity to get reacquainted with the Jetpack APIs. I myself was impressed by how easy it was to develop a quick Jetpack. I ended up preparing two to live-code during the talk: Helvetica, which, with one click, replaces all fonts on the current page with Helvetica; and You Are Here, which uses an open API from IPinfoDB to display the physical location of the domain you are currently visiting in the status bar. Both are now on the Jetpack Gallery.

Unfortunately there was a bit of a snowstorm leading up to the event, but there was still a nice turnout and I got to meet some fantastic people there. Ken Shoemake of slerp and quaternion fame came up to me after my talk and said “the Ubiquity parser reminded me of the dancing bear… it’s less surprising that it works well as that it works at all.” :) I also enjoyed the other great presentations in the technology track, covering the virtues of REST and basic iPhone development.

Mashup the Browser with Ubiquity and Jetpack

Extending WordPress talk at the Boston WordPress Meetup

Tuesday, September 29th, 2009

Yesterday I gave a talk at the Boston WordPress Meetup. The Boston WordPress Meetup meets monthly at the Microsoft Cambridge Research Center, which is a fantastic venue right on the Charles River. Last night we got to be up on the 10th floor, which has a great view of Boston right over the river. There was a pretty good turnout, with about thirty or forty people there.

My talk was a general introduction to WordPress plugin development, beginning with the concepts of actions and filters, and concluding with a description of HookPress, my new plugin which enables webhooks in WordPress. Here are the slides:


Exploring Command Chaining in Ubiquity: Part 2

Sunday, August 23rd, 2009


I recently have begun giving serious thought to what command chaining might look like in Ubiquity and the various considerations which must be made to make it happen. The “command chaining,” or “piping,” described here always involves (at least) two verbs acting sequentially on a passed target—that is, the first command performs some action or lookup and the second command acts on the first command’s output.

A few days ago I penned some initial technical considerations regarding command chaining. In this post I’ll point out some linguistic considerations involved in supporting a natural syntax for chaining.


The Ubiquity Persistence Project: exploring a persistent Ubiquity in the toolbar

Thursday, August 20th, 2009

It’s often hard to remember Ubiquity’s presence and keystroke without a visual reminder—even I often forget that I could use Ubiquity and end up going to a search engine or using the search bar for some quick lookup task. What if the Ubiquity input were in the toolbar and always visible? How would that affect people’s use of Ubiquity? And what could we make that look like and how would it behave? Today we’re kicking off the Ubiquity Persistence Project, a new Ubiquity initiative to explore what a persistent Ubiquity might look like in the Firefox toolbar.


In order to facilitate this discussion, we created the Persistence tool. With the Persistence tool you can quickly try out new design and interaction ideas, mocking things up with some simple jQuery-powered JavaScript and CSS and seeing your changes live. The Persistence tool is bundled with our latest Ubiquity beta (install link).

The Ubiquity Persistence Project: exploring a persistent Ubiquity in the toolbar from mitcho on Vimeo.

I just put together a screencast introducing the initiative, demoing the Persistence tool, as well as talking about this project’s relation to the ongoing work on Taskfox. We’ll look forward to your comments and designs! :D

Exploring Command Chaining in Ubiquity: Part 1

Wednesday, August 19th, 2009

Since the dawn of time people have been asking about command chaining in Ubiquity. If you have a translate command and an email command, it would be great to be able to, for example, translate hello to Spanish and email to Juanito. This is what we call command chaining or piping: in a single complex query, specifying multiple (probably two) actions and using the first’s output as the second’s input.1
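As a rough illustration of what "using the first’s output as the second’s input" means, here is a toy sketch in Python. Nothing here is Ubiquity code: the `translate`, `email`, and `chain` functions and the one-word dictionary are hypothetical stand-ins for real commands.

```python
def translate(text, to):
    """Stand-in translate command backed by a tiny hypothetical dictionary."""
    dictionary = {("hello", "Spanish"): "hola"}
    return dictionary.get((text, to), text)

def email(text, to):
    """Stand-in email command: returns a description instead of sending mail."""
    return f"email to {to}: {text}"

def chain(first, second):
    """Run the first command, then feed its output to the second as input."""
    first_cmd, first_args = first
    second_cmd, second_args = second
    intermediate = first_cmd(**first_args)
    return second_cmd(intermediate, **second_args)

# "translate hello to Spanish and email to Juanito"
result = chain(
    (translate, {"text": "hello", "to": "Spanish"}),
    (email, {"to": "Juanito"}),
)
print(result)
```

Note that the second command is given no text of its own: its input slot is filled by whatever the first command returns, which is what separates piping from simply running two commands side by side.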

Today I hope to cover some of the technical considerations required in implementing command chaining in Ubiquity, and I will follow up soon with a blog post on the linguistic considerations required as well.


  1. We’re going to limit our discussion here to this restriction that the two verbs are not simply two simultaneous commands, but two commands which operate successively on an input, i.e., that it is true piping. This for example rules out input such as google dogs and translate cat to Spanish, as the second command’s execution does not semantically depend on the first’s execution. This (hopefully uncontroversial) decision also affects the linguistic considerations to be made in my next post. 

Performance vs Responsiveness —or— How I Made the Parser Twice As Fast in One Day

Thursday, August 13th, 2009

Since we launched Ubiquity 0.5, the issue of Parser 2 performance has been brought up over and over within the community. By virtue of having a more flexible and localizable design, Parser 2 was expected to be slower than our original parser, but its current implementation felt noticeably—perhaps unnecessarily—slow compared to Parser 1. Parser 2 performance has been identified as one of the blockers for pushing Ubiquity 0.5+ to all of our 0.1.x users, and has thus been one of my recent foci.

The short story:

Inspired by some comments by Blair, yesterday I was able to make significant (roughly 100%) performance gains in Parser 2, resulting in 40-60% faster parses, depending on the query. This change has been committed and will be released as part of our forthcoming minor update, Ubiquity 0.5.4. Yay!
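The two percentages describe the same change from different angles, and a little arithmetic relates them. The sketch below assumes that "40-60% faster" means 40-60% less time per parse; that reading, and the helper names, are assumptions made for illustration.

```python
def time_saved(throughput_gain):
    """Fraction of time saved per parse for a fractional gain in parses per second."""
    return 1 - 1 / (1 + throughput_gain)

def throughput_gain(time_saved_fraction):
    """Fractional gain in parses per second for a fraction of time saved per parse."""
    return 1 / (1 - time_saved_fraction) - 1

# A 100% throughput gain (twice as many parses per second) halves the time per parse.
print(time_saved(1.0))

# Saving 40% to 60% of the time per parse corresponds to these throughput gains.
for saved in (0.4, 0.5, 0.6):
    print(saved, round(throughput_gain(saved), 2))
```

Under that assumption, "twice as fast" sits in the middle of the range: halving the parse time is exactly a 100% gain in throughput.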


HookPress: Webhooks for WordPress

Thursday, August 6th, 2009

I recently have spent a little time putting together a new WordPress plugin called HookPress. HookPress lets you add webhooks to WordPress, letting you easily develop push notifications or extend WordPress in languages other than PHP.

WordPress itself is built on a powerful plugin API which provides actions and filters. Actions correspond to events, so you can set a webhook to fire when a post is published or a comment is made.1 Filters let you modify some text when it is saved or displayed, so you can have your external webhook script reformat some text or insert some other content dynamically. Not all actions and filters are supported at this time, but I will continue to add more in.
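If the action/filter distinction is new to you, this toy registry may help. It is written in Python rather than PHP and is not how WordPress or HookPress is implemented: the `Hooks` class is hypothetical, the hook names are borrowed from WordPress purely as labels, and a plain list stands in for the HTTP POSTs a webhook would send.

```python
from collections import defaultdict

class Hooks:
    """Toy registry that mimics the action/filter distinction."""

    def __init__(self):
        self._callbacks = defaultdict(list)

    def add(self, hook, callback):
        self._callbacks[hook].append(callback)

    def do_action(self, hook, *args):
        # Actions are events: callbacks run for their side effects only.
        for callback in self._callbacks[hook]:
            callback(*args)

    def apply_filters(self, hook, value):
        # Filters are transformations: each callback receives and returns the value.
        for callback in self._callbacks[hook]:
            value = callback(value)
        return value

hooks = Hooks()
notifications = []  # stands in for HTTP POSTs to a webhook URL

hooks.add("publish_post", lambda post_id: notifications.append(f"post {post_id} published"))
hooks.add("the_content", lambda text: text.replace(":)", "(smiley)"))

hooks.do_action("publish_post", 42)
print(notifications)
print(hooks.apply_filters("the_content", "Hello :)"))
```

The key difference is the return value: an action webhook only needs to be told that something happened, while a filter webhook has to send modified text back.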

There’s a webhooks meetup in San Francisco today but I unfortunately left SF this morning, so I created a video which will be played there as a lightning talk. Demos of both types of webhooks are in the video as well.

HookPress: add webhooks to WordPress from mitcho on Vimeo.

I’m really excited by this very simple but potentially high-impact plugin. I’d love to get your comments and feedback on this new plugin and hope to hear how you’re using HookPress!

ADDENDUM: Please also follow HookPress on Twitter!

  1. My friend Abi actually has already blogged about HookPress and how it can be used to tweet on post publication.