<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>mitcho.com &#187; automation</title>
	<atom:link href="http://mitcho.com/blog/tag/automation/feed/" rel="self" type="application/rss+xml" />
	<link>http://mitcho.com</link>
	<description></description>
	<lastBuildDate>Tue, 07 Feb 2012 02:04:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4-alpha-19719</generator>
		<item>
		<title>Automating the Linguist&#8217;s Job</title>
		<link>http://mitcho.com/blog/projects/automating-the-linguists-job/</link>
		<comments>http://mitcho.com/blog/projects/automating-the-linguists-job/#comments</comments>
		<pubDate>Tue, 24 Mar 2009 08:57:58 +0000</pubDate>
		<dc:creator>mitcho</dc:creator>
				<category><![CDATA[projects]]></category>
		<category><![CDATA[analogy]]></category>
		<category><![CDATA[automation]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[deduction]]></category>
		<category><![CDATA[Dutch]]></category>
		<category><![CDATA[linguistics]]></category>
		<category><![CDATA[Mozilla Planet]]></category>
		<category><![CDATA[parser]]></category>
		<category><![CDATA[patterns]]></category>
		<category><![CDATA[ubiquity]]></category>

		<guid isPermaLink="false">http://mitcho.com/blog/?p=1634</guid>
		<description><![CDATA[At the end of my blog post yesterday I hinted at an exciting possible approach to Ubiquity&#8217;s localization: In the future we ideally could build a web-based system to collect these &#8220;utterances.&#8221; We could &#8230; generate parser parameters based on those sentences. That would essentially reduce the parser-construction process to a more run-of-the-mill string translation [...]
Related posts:<ol>
<li><a href='http://mitcho.com/blog/projects/ubiquity-i18n-questions-to-ask/' rel='bookmark' title='Ubiquity i18n: questions to ask'>Ubiquity i18n: questions to ask</a></li>
<li><a href='http://mitcho.com/blog/projects/localizing-ubiquity-an-open-letter-to-linguists/' rel='bookmark' title='Localizing Ubiquity: an open letter to linguists'>Localizing Ubiquity: an open letter to linguists</a></li>
<li><a href='http://mitcho.com/blog/projects/writing-commands-with-semantic-roles/' rel='bookmark' title='Writing commands with semantic roles'>Writing commands with semantic roles</a></li>
</ol>

Related posts brought to you by <a href='http://yarpp.org'>Yet Another Related Posts Plugin</a>.]]></description>
			<content:encoded><![CDATA[<p>At the end of <a href="http://mitcho.com/blog/projects/ubiquity-i18n-questions-to-ask/">my blog post yesterday</a> I hinted at an exciting possible approach to Ubiquity&#8217;s localization:</p>

<blockquote>
  <p>In the future we ideally could build a web-based system to collect these &#8220;utterances.&#8221; We could &#8230; generate parser parameters based on those sentences. That would essentially reduce the parser-construction process to a more run-of-the-mill string translation process.</p>
</blockquote>

<p>If we build this type of &#8220;command-bank&#8221; of common Ubiquity input translated into various languages, we could build a tool to learn various features of each language and generate each parser, essentially <em>learning the language based on data</em>. Today I&#8217;ll elaborate on how I believe this could be possible, by analogy to another language learning device: <strong>the human</strong>.</p>

<p><span id="more-1634"></span></p>

<h3>Step 1: learning words</h3>

<p>How does a human learn language? Without getting into any <a href="http://en.wikipedia.org/wiki/language_acquisition">details or theory</a>, we can say that the input for a language learner is always a combination of <em>linguistic input and a referent</em>. In the case of a child, this could be a pairing of linguistic input with a <em>real-world stimulus</em>:</p>

<p><center></p>

<table style='border:none;'><tr><th>input</th><th>referent</th></tr>
<tr><td style='font-size:2em;color:orange;font-weight:bold;text-align:center;'>“taiyaki!”</td><td><img src='http://farm4.static.flickr.com/3543/3357452751_977fcce70c.jpg?v=0' width='300'/><br/>
by <a href='http://www.flickr.com/photos/makitani/3357452751/'>makitani</a> via <a href='http://creativecommons.org'>creative commons</a>.</td></tr>
<tr><td style='font-size:2em;color:orange;font-weight:bold;width:50%;text-align:center;'>“cat!”</td><td><img src='http://farm4.static.flickr.com/3285/2387513295_2768ddf662.jpg?v=0' width='300'/><br />
by <a href='http://www.flickr.com/photos/victoriachan/2387513295/in/set-72157604986983169/'>victoriachan</a> via <a href='http://creativecommons.org'>creative commons</a>.</td></tr>
</table>

<p></center></p>

<p>The human child will hear &#8220;cat&#8221; while looking at the cat and, with time and repetition, learn that that thing is called a &#8220;cat,&#8221; and <a href="http://en.wikipedia.org/wiki/taiyaki">some other thing</a> is called &#8220;taiyaki.&#8221;</p>

<p>Similarly, we could take single-verb data points from our command-bank to match new words with a known referent: in this case, the base English string. Here&#8217;s an example from <a href="http://jan.moesen.nu/">Jan&#8217;s</a> comment on <a href="http://mitcho.com/blog/projects/ubiquity-i18n-questions-to-ask/">yesterday&#8217;s sample survey</a>:</p>

<p><center></p>

<table style='border:none;'><tr><th>input (Dutch)</th><th>referent (English)</th></tr>
<tr><td style='font-size:2em;color:orange;font-weight:bold;text-align:center;'>zoek</td><td style='font-size:2em;color:blue;font-weight:bold;text-align:center;'>search</td></tr>
</table>

<p></center></p>
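
<p>As a minimal sketch of this word-learning step, the snippet below collects a lexicon from single-word entries only. The <code>commandBank</code> array, its <code>input</code> and <code>referent</code> field names, and the <code>learnSingleWords</code> function are all assumptions made up for illustration, not part of Ubiquity.</p>

```javascript
// Hypothetical command-bank entries: each pairs a translated input with its
// English base string. The field names are assumptions for illustration.
const commandBank = [
  { input: 'zoek', referent: 'search' },
  { input: 'zoek HELLO met Google', referent: 'search HELLO with Google' },
];

// Learn a word only from entries where both sides are a single token,
// mirroring the one-word "cat!" stage of the analogy.
function learnSingleWords(entries) {
  const lexicon = new Map();
  for (const { input, referent } of entries) {
    const inputTokens = input.trim().split(/\s+/);
    const referentTokens = referent.trim().split(/\s+/);
    if (inputTokens.length === 1 && referentTokens.length === 1) {
      lexicon.set(inputTokens[0], referentTokens[0]);
    }
  }
  return lexicon;
}

const lexicon = learnSingleWords(commandBank);
console.log(lexicon.get('zoek')); // prints "search"
```

<p>Multi-word entries are skipped here on purpose: they only become useful once there is something to compare them against.</p>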

<h3>Step 2: deduction</h3>

<p>Now suppose we know some single words like &#8220;taiyaki&#8221; and &#8220;cat.&#8221; Consider the two situations below. Given the first sentence and its referent, &#8220;mitcho&#8217;s eating a taiyaki,&#8221; the child could intuit the appropriate linguistic representation for the second situation.</p>

<p><center></p>

<table style='border:none;'><tr><th>input</th><th>referent</th></tr>
<tr><td style='font-size:2em;color:orange;font-weight:bold;width:50%;text-align:center;'>“mitcho&#8217;s eating a taiyaki!”</td><td><img src="http://mitcho.com/blog/wp-content/uploads/2009/03/eattaiyaki.jpg" alt="eattaiyaki.jpg" border="0" width="300" height="225" /></td></tr>
<tr><td style='font-size:2em;color:red;font-weight:bold;text-align:center;'>???</td><td><img src="http://mitcho.com/blog/wp-content/uploads/2009/03/eatcat.jpg" alt="eatcat.jpg" border="0" width="300" height="225" /></td></tr>
</table>

<p></center></p>

<p>The process is simple. First note that there is only one variable changed between the two situations: the taiyaki has been replaced by a cat head. You can then construct the correct utterance <em>by analogy</em>, replacing &#8220;taiyaki&#8221; with &#8220;cat,&#8221; yielding &#8220;mitcho&#8217;s eating a cat!&#8221;<sup id="fnref:2"><a href="#fn:2" rel="footnote">1</a></sup></p>

<p>Similarly, we could build a tool to analyze the data in a translated command-bank to identify particular features of each language, generating at least basic parsers for each language. Such a task would require a number of <em><a href="http://en.wikipedia.org/wiki/minimal_pairs">minimal pairs</a></em> in our data set. Here&#8217;s one such example from yesterday&#8217;s survey (with Dutch data from <a href="http://jan.moesen.nu/">Jan</a>):</p>

<p><center></p>

<table style='border:none;'><tr><th>input (Dutch)</th><th>referent (English)</th></tr>
<tr><td style='font-size:1.5em;color:orange;font-weight:bold;text-align:center;'>zoek HELLO met Google</td><td>
<span style='font-size:1.5em;color:blue;font-weight:bold;'>search HELLO with Google</span><br/>
<code>
<pre>Parse {
  verb:      'search',
  arguments: {
    object:  ['HELLO'],
    service: 'Google'
  }
}</pre>
</code></td></tr>
<tr><td style='font-size:1.5em;color:orange;font-weight:bold;text-align:center;'>zoek dit met Google</td><td>
<span style='font-size:1.5em;color:blue;font-weight:bold;'>search this with Google</span><br/>
<code>
<pre>Parse {
  verb:      'search',
  arguments: {
    object:  ['this'],
    service: 'Google'
  }
}</pre>
</code></td></tr></table>

<p></center></p>

<p>A simple string analysis<sup id="fnref:3"><a href="#fn:3" rel="footnote">2</a></sup> would tell us that the text <code>HELLO</code> was replaced by <code>dit</code> in the latter Dutch sentence. Meanwhile, since the English reference sentence is chosen manually, we also know the appropriate parses for each of those sentences. An object difference operation would note that the <code>object</code> property was changed from a value of <code>'HELLO'</code> to <code>'this'</code>. We could then map <code>dit</code> to the English <code>this</code>. We&#8217;ve now learned one (of perhaps many) Dutch deictic pronouns (aka &#8220;magic words&#8221;).</p>
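
<p>The deduction above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: the two sentences are compared token by token and must have the same number of tokens, and the helper names <code>tokenDiff</code>, <code>parseDiff</code>, and <code>learnFromMinimalPair</code> are hypothetical, not part of the Ubiquity parser.</p>

```javascript
// Return the token substitutions between two sentences of equal length.
// Sentences of different lengths are not handled in this sketch.
function tokenDiff(a, b) {
  const tokensA = a.trim().split(/\s+/);
  const tokensB = b.trim().split(/\s+/);
  if (tokensA.length !== tokensB.length) return null;
  const changes = [];
  tokensA.forEach((token, i) => {
    if (token !== tokensB[i]) changes.push({ from: token, to: tokensB[i] });
  });
  return changes;
}

// Return the argument roles whose values differ between two parses.
function parseDiff(p, q) {
  const changes = [];
  for (const role of Object.keys(p.arguments)) {
    const before = String(p.arguments[role]);
    const after = String(q.arguments[role]);
    if (before !== after) changes.push({ role, from: before, to: after });
  }
  return changes;
}

// If exactly one token and exactly one argument changed, pair them up.
function learnFromMinimalPair(a, b) {
  const stringChanges = tokenDiff(a.input, b.input);
  const parseChanges = parseDiff(a.parse, b.parse);
  if (!stringChanges || stringChanges.length !== 1 || parseChanges.length !== 1) {
    return null;
  }
  return {
    word: stringChanges[0].to,
    meaning: parseChanges[0].to,
    role: parseChanges[0].role,
  };
}

const first = {
  input: 'zoek HELLO met Google',
  parse: { verb: 'search', arguments: { object: ['HELLO'], service: 'Google' } },
};
const second = {
  input: 'zoek dit met Google',
  parse: { verb: 'search', arguments: { object: ['this'], service: 'Google' } },
};

// Logs an object with word 'dit', meaning 'this', and role 'object'.
console.log(learnFromMinimalPair(first, second));
```

<p>Returning <code>null</code> whenever more than one thing changes keeps the sketch honest: a pair that differs in two places is not a minimal pair, so nothing can be safely deduced from it.</p>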

<p>Given <a href="http://mitcho.com/code/ubiquity/parser-demo/">an adequately universal but customizable parser design</a>, we can then develop tests for various parameters by constructing appropriate <a href="http://en.wikipedia.org/wiki/minimal_pairs">minimal pairs</a> in the base sentences and having them translated.<sup id="fnref:1"><a href="#fn:1" rel="footnote">3</a></sup> As noted yesterday, such a system could reduce the laborious task of writing individual parsers to a task of string translation, which <a href="https://wiki.mozilla.org/L10n:Home_Page">our community does exceedingly well</a>. <strong>I&#8217;m eager to hear what others think of this approach. What concerns would you have about it? What potential benefits do you see?</strong></p>

<div class="footnotes">
<hr />
<ol>

<li id="fn:2">
<p>I mean no offense to human children with this simplified example. Surely you can learn more than just string replacements.&#160;<a href="#fnref:2" rev="footnote">&#8617;</a></p>
</li>

<li id="fn:3">
<p>I started building some string analysis toys in JavaScript today, such as a <a href="http://mitcho.com/code/ubiquity/levenshtein/">Levenshtein distance demo</a>.&#160;<a href="#fnref:3" rev="footnote">&#8617;</a></p>
</li>

<li id="fn:1">
<p>The linguists in the audience may note that this parser&#8217;s modular design is indeed in the spirit of the <a href="http://en.wikipedia.org/wiki/principles_and_parameters">principles and parameters</a> framework.&#160;<a href="#fnref:1" rev="footnote">&#8617;</a></p>
</li>

</ol>
</div>
]]></content:encoded>
			<wfw:commentRss>http://mitcho.com/blog/projects/automating-the-linguists-job/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Keep up with Yet Another Related Posts Plugin with RSS!</title>
		<link>http://mitcho.com/blog/projects/keep-up-with-yet-another-related-posts-plugin-with-rss/</link>
		<comments>http://mitcho.com/blog/projects/keep-up-with-yet-another-related-posts-plugin-with-rss/#comments</comments>
		<pubDate>Sat, 04 Oct 2008 16:43:34 +0000</pubDate>
		<dc:creator>mitcho</dc:creator>
				<category><![CDATA[how to]]></category>
		<category><![CDATA[projects]]></category>
		<category><![CDATA[automation]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[feed]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[plugin]]></category>
		<category><![CDATA[RSS]]></category>
		<category><![CDATA[WordPress]]></category>
		<category><![CDATA[YARPP]]></category>

		<guid isPermaLink="false">http://mitcho.com/blog/?p=789</guid>
		<description><![CDATA[As more and more people have been using my Yet Another Related Posts Plugin for WordPress, I thought it would be nice to have an RSS feed for users to stay on top of the latest releases. Yet Another Related Posts Plugin version log RSS 2.0 Clicking on a version&#8217;s permalink will let you download [...]
Related posts:<ol>
<li><a href='http://mitcho.com/blog/projects/yet-another-related-posts-plugin-20/' rel='bookmark' title='Yet Another Related Posts Plugin 2.0'>Yet Another Related Posts Plugin 2.0</a></li>
<li><a href='http://mitcho.com/blog/projects/yet-another-related-posts-plugin/' rel='bookmark' title='Yet Another Related Posts Plugin'>Yet Another Related Posts Plugin</a></li>
<li><a href='http://mitcho.com/blog/projects/modifiying-wordpress-plugin-activation-behavior/' rel='bookmark' title='Modifiying WordPress plugin activation behavior'>Modifiying WordPress plugin activation behavior</a></li>
</ol>

Related posts brought to you by <a href='http://yarpp.org'>Yet Another Related Posts Plugin</a>.]]></description>
			<content:encoded><![CDATA[<p>As <a href="http://wordpress.org/extend/plugins/yet-another-related-posts-plugin/stats/">more and more people</a> have been using my <a href="/code/yarpp">Yet Another Related Posts Plugin</a> for <a href="http://en.wikipedia.org/wiki/WordPress">WordPress</a>, I thought it would be nice to have an RSS feed for users to stay on top of the latest releases.</p>

<div class="files">
<div class="file rss">
<a href="http://mitcho.com/code/yarpp/yarpp.rss">Yet Another Related Posts Plugin version log</a><br />
<span class="specs">RSS 2.0</span>
</div>
</div>

<p>Clicking on a version&#8217;s permalink will let you download the plugin. Subscribe now and be the first to find out when the upcoming version 2.1 is released!</p>

<p>I decided to semi-automate this RSS-producing process as well. As a plugin developer using <a href="http://wordpress.org/extend">wordpress.org</a>&#8217;s plugin hosting, I sync a local copy of the plugin to their server using <a href="http://en.wikipedia.org/wiki/SVN">SVN</a>. I wrote a <a href="http://en.wikipedia.org/wiki/PHP">PHP</a> script to get the modification date information directly from the local files, parse the version log in the readme, and produce the RSS feed. If there&#8217;s interest, perhaps I&#8217;ll release this code in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://mitcho.com/blog/projects/keep-up-with-yet-another-related-posts-plugin-with-rss/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

