tag:blogger.com,1999:blog-84014572083132879632024-02-18T23:01:18.011-08:00CodeSith's BlogCodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.comBlogger47125tag:blogger.com,1999:blog-8401457208313287963.post-17764770250395392582013-05-22T21:55:00.002-07:002013-05-22T22:01:02.245-07:00Why is it important to follow "open standards"An open standard is a standard that is publicly available and has various rights to use associated with it, and may also have various properties of how it was designed (e.g. open process). There is no single definition and interpretations vary with usage. --- Wikipedia<br />
<br />
Why do I promote "open standards" in my team? <br />
<br />
<b>1. To avoid reinventing the wheel. </b><br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCnuhjQY8fCRD5SXocyWAgQ4mjEhT98jEqkqH4FUcZMooZp4n_d0KP3rBPgbFfjm04RIvu-onaENFjMF4avHRu2Hdtmhr3BG8gM6HmH8qxDE4wxWVmTLgotr9lsOKZFfdSEEzuws49-R2E/s1600/reinvent-the-wheel.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="200" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjCnuhjQY8fCRD5SXocyWAgQ4mjEhT98jEqkqH4FUcZMooZp4n_d0KP3rBPgbFfjm04RIvu-onaENFjMF4avHRu2Hdtmhr3BG8gM6HmH8qxDE4wxWVmTLgotr9lsOKZFfdSEEzuws49-R2E/s200/reinvent-the-wheel.jpg" width="170" /></a>If a public solution works well enough that many people have already adopted it, there's no reason to rebuild it. For example, major programming languages already ship implementations of merge sort or quick sort. I would not want my developers to rewrite those sort algorithms other than during interviews.<br />
<br />
<b><br /></b>
<b><br /></b>
<b><br /></b>
<b><br /></b>
<b><br /></b>
<b><br /></b>
<b>2. Free test coverage.</b><br />
Open standards are usually used by many individuals and organizations. If I write a library function that closes file handles "silently", I have to write a lot of unit tests. The test team will have to write a lot of functional tests. We all have to run the tests in multiple environments (Linux, Mac, Windows, etc.). We also need to create many stress test cases to cover failures. Apache Commons IO already offers this function, and it has been tested and used by many developers, so I don't need to test it as much.<br />
<br />
<b>3. Open standards usually work well with each other.</b><br />
Unlike a "proprietary standard", an "open standard" is meant to be shared. As a result, many open standards work in very similar ways. Recently, one of my devs ported a traditional web services application over to a smart compute grid infrastructure. He changed fewer than 500 lines of code, only because the original web services container and the compute grid share the same "open standard". <br />
<br />
<b>4. Open standards expand one's horizon.</b><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDEiACUzCsY49jrOXdN86ptnj-JmXkW9eBTde2rnBfX7_JObI9Ux0y83dqHSPWZZVv0PyUh9X6Lt-13XfjXSxlK2RVdfQ3tw5jHX9Ehr-PkkDGST_lafpuTzf60W52bv2vhXSomppCoNKF/s1600/frog+in+the+well+puzzle.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="149" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhDEiACUzCsY49jrOXdN86ptnj-JmXkW9eBTde2rnBfX7_JObI9Ux0y83dqHSPWZZVv0PyUh9X6Lt-13XfjXSxlK2RVdfQ3tw5jHX9Ehr-PkkDGST_lafpuTzf60W52bv2vhXSomppCoNKF/s200/frog+in+the+well+puzzle.jpg" width="200" /></a></div>
A lot of developers who have only worked with certain proprietary technologies are fairly narrow-minded. I have seen plenty of .NET developers who believe that every data-storage problem should be solved by SQL Server. And many C#-only developers do not know IoC (inversion of control). SQL Server is a solid relational DB, and C# is a useful language. However, there are plenty of alternative open technologies that are equally good, or even better in certain cases. <br />
<br />
<br />
<b>5. Knowing open standards makes you more marketable.</b><br />
For all the reasons above, and from my <a href="http://codesith.blogspot.com/2012/07/how-marketable-are-you.html" target="_blank">previous post</a>, everyone should learn open standards to make them more marketable.CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-2745223634033171872013-05-05T17:51:00.002-07:002013-07-05T21:51:37.584-07:00Test in Production!<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZWrpmV5azXvErIFxisjkN4Z3y-FrMnH-jlAoVU-3HKbYxG2uRjAlpzrgVBag0dyd2IvPgFzPX1VerPcSHOmcNeYACz4dnWOK0iIrO0kkR3Ze_Ace1ahfG26qxyj4reLV0xO5YLQhjXjIK/s1600/test.png" imageanchor="1"><img border="0" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZWrpmV5azXvErIFxisjkN4Z3y-FrMnH-jlAoVU-3HKbYxG2uRjAlpzrgVBag0dyd2IvPgFzPX1VerPcSHOmcNeYACz4dnWOK0iIrO0kkR3Ze_Ace1ahfG26qxyj4reLV0xO5YLQhjXjIK/s400/test.png" width="320"></a><br>
<br>
Does anyone have a problem with the picture above? I certainly don't! However, I'll add this: "when I test on production, I test carefully".<br>
<br>
In software service development, testing is critical. Services often go through multiple tiers of testing environments before the bits are finally released to the customer. But why can't we do one-stop testing directly on production? <br>
<br>
The typical answer to this question is "NO, YOU CANNOT IMPACT CUSTOMERS!". This is because the traditional practice of "testing on production" is to take a slice of the production traffic, and to put a pre-release version service behind the VIP along with all other current version services. This way, only a small portion of customers are impacted. But some customers ARE impacted!<br>
<br>
This is where "port forwarding" comes in handy. To ensure that NO customers are impacted at all, we can "clone" a slice of the production traffic and send it to a pre-release version service running alongside all the current version services. The cloned traffic is one-way only: it goes into the pre-release version service, but the responses never return to the customer. This way, NO customers are impacted by the behavior of the "un-tested" pre-release service.<br>
<br>
What do we get out of this?<br>
<br>
<ul>
<li>Free stress test! Instead of trying to set up a stress testing environment that simulates production volumes and request patterns, you will have a service running in production, getting production requests and throughput.</li>
<li>Free regression test! If you compare the responses from both current version service and pre-release version service, you get yourself a simple regression test suite.</li>
<li>Frequent regression test! Hook this into your favorite CI framework and you get to run regression tests 24/7, given that you have production traffic 24/7.</li>
</ul>
<div>
There are multiple ways to do "port forwarding". Advanced load balancers have built-in features to clone a subset of requests. If you are running Linux, you can configure iptables to forward specific ports. Since I'm not a sysadmin, I prefer software solutions that I can modify and manage. </div>
<div>
<br></div>
<div>
Node.js has a nice module, <b><a href="https://github.com/nodejitsu/node-http-proxy" target="_blank">node-http-proxy</a></b>. You can run a proxy service that bridges traffic to a "target" service and a "forwarding" service. The target service is the one that handles real traffic; its responses are sent back to the customer through the proxy. The forwarding service is one-way only: it gets the same requests as the target service, but never returns anything back to the customer. With this setup, you can TEST on PRODUCTION!</div>
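<div>
To make the target/forwarding split concrete, here is a minimal sketch that uses only Node's built-in <b>http</b> module rather than node-http-proxy itself. The addresses in TARGET and FORWARD, the port numbers, and the helper name buildOptions are all made up for illustration, and a real setup would need sampling, timeouts and logging on top of this.</div>

```javascript
var http = require("http");

// Hypothetical addresses: the current version (target) and the
// pre-release version (forward) of the service.
var TARGET = { host: "127.0.0.1", port: 9001 };
var FORWARD = { host: "127.0.0.1", port: 9002 };

// Build the options needed to re-send an incoming request to a backend.
function buildOptions(request, backend) {
  return {
    host: backend.host,
    port: backend.port,
    method: request.method,
    path: request.url,
    headers: request.headers
  };
}

// Proxy handler: the target's response goes back to the customer,
// the forwarding service's response is read and thrown away.
function onRequest(request, response) {
  var toTarget = http.request(buildOptions(request, TARGET), function (targetResponse) {
    response.writeHead(targetResponse.statusCode, targetResponse.headers);
    targetResponse.pipe(response);
  });
  toTarget.on("error", function () {
    if (!response.headersSent) {
      response.writeHead(502);
    }
    response.end();
  });

  var toForward = http.request(buildOptions(request, FORWARD), function (forwardResponse) {
    forwardResponse.resume(); // drain and discard; never reaches the customer
  });
  // A broken pre-release build must not affect the customer's request.
  toForward.on("error", function () {});

  // Send the same request body to both services.
  request.pipe(toTarget);
  request.pipe(toForward);
}

// To run the proxy: http.createServer(onRequest).listen(8000);
```

<div>
The important design choice is that errors and responses from the forwarding side are swallowed, so the worst a bad pre-release build can do is waste its own resources.</div>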
CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-60581527858616793972012-12-23T21:58:00.002-08:002012-12-23T22:00:34.104-08:00Create FilterChain in node.jsI often try to learn something interesting (mostly programming related) whenever I get a long break from work. Last Thanksgiving, I wrote an iPhone app that syncs photos among S3, Flickr and Facebook. This Xmas, I took on writing my first node.js app.<br />
<br />
I used http://www.nodebeginner.org/ as a starting point. Five minutes into the tutorial, I encountered a strange problem. For every request sent from the Chrome browser to my node.js server, my service recorded <b>TWO</b> requests. A quick Google search indicated that Chrome ALWAYS sends an additional "/favicon.ico" request to an HTTP server if it cannot locate an icon for that server.<br />
<br />
This was really annoying because it messed up my global debugging counter. The solution was simple: just ignore all requests for "/favicon.ico". But "SIMPLE" solutions are no fun, especially while learning. If I were to do this in Java, I'd use a servlet Filter to "preFilter" out all unqualified requests. So I put my JavaScript and Java skills to the test, and wrote this simple FilterChain function in node.js. Enjoy!<br />
<br />
<br />
<div class="p1">
<span style="font-size: x-small;"><span class="s1">var</span> http = <span class="s1">require</span>(<span class="s2">"http"</span>);</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p3">
<span style="font-size: x-small;">/*</span></div>
<div class="p3">
<span style="font-size: x-small;">* Manage all filters.</span></div>
<div class="p3">
<span style="font-size: x-small;">*/</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="s1">var</span> filterChain = {</span></div>
<div class="p1">
<span style="font-size: x-small;"> filters: <span class="s1">new</span> Array(),</span></div>
<div class="p1">
<span style="font-size: x-small;"> add: <span class="s1">function</span>(filter) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">this</span>.filters.push(filter);</span></div>
<div class="p1">
<span style="font-size: x-small;"> },</span></div>
<div class="p1">
<span style="font-size: x-small;"> applyAll: <span class="s1">function</span>(request, response) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">this</span>.apply(request, response, 0);</span></div>
<div class="p1">
<span style="font-size: x-small;"> },</span></div>
<div class="p1">
<span style="font-size: x-small;"> apply: <span class="s1">function</span>(request, response, i) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">if</span> (i == <span class="s1">this</span>.filters.length) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> processRequest(request, response);</span></div>
<div class="p1">
<span style="font-size: x-small;"> }</span></div>
<div class="p2">
<span style="font-size: x-small;"> </span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">var</span> filter = <span class="s1">this</span>.filters[i];</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p3">
<span style="font-size: x-small;"><span class="s3"> </span>// call preFilter and exits if fails</span></div>
<div class="p1">
<span style="font-size: x-small;"> console.log(filter.name + <span class="s2">".preFilter"</span>);</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">var</span> success = filter.preFilter(request, response); </span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">if</span> (!success) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> false;</span></div>
<div class="p1">
<span style="font-size: x-small;"> }</span></div>
<div class="p3">
<span style="font-size: x-small;"><span class="s3"> </span>// call next filter and exits if fails</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> success = <span class="s1">this</span>.apply(request, response, i+1);</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> <span class="s1">if</span> (!success) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> false;</span></div>
<div class="p1">
<span style="font-size: x-small;"> }</span></div>
<div class="p2">
<span style="font-size: x-small;"> </span></div>
<div class="p3">
<span style="font-size: x-small;"><span class="s3"> </span>// call postFilter and exits</span></div>
<div class="p1">
<span style="font-size: x-small;"> console.log(filter.name + <span class="s2">".postFilter"</span>);</span></div>
<div class="p1">
<span style="font-size: x-small;"> success = filter.postFilter(request, response);</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> success;</span></div>
<div class="p1">
<span style="font-size: x-small;"> }</span></div>
<div class="p1">
<span style="font-size: x-small;">}</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p3">
<span style="font-size: x-small;">/*</span></div>
<div class="p3">
<span style="font-size: x-small;">* Filters</span></div>
<div class="p3">
<span style="font-size: x-small;">* All filters must implement 3 things:</span></div>
<div class="p3">
<span style="font-size: x-small;">* name - String unique name for this filter</span></div>
<div class="p3">
<span style="font-size: x-small;">* preFilter() - Executed before processing request</span></div>
<div class="p3">
<span style="font-size: x-small;">* postFilter() - Executed after processing request</span></div>
<div class="p3">
<span style="font-size: x-small;">*/</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="s1">var</span> faviconFilter = {</span></div>
<div class="p1">
<span style="font-size: x-small;"> name: <span class="s2">"favicon"</span>,</span></div>
<div class="p1">
<span style="font-size: x-small;"> preFilter: <span class="s1">function</span>(request, response) {</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> <span class="s1">if</span> (request.url === <span class="s2">'/favicon.ico'</span>) {</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> response.writeHead(200, {<span class="s2">'Content-Type'</span>: <span class="s2">'image/x-icon'</span>} );</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> response.end();</span></div>
<div class="p4">
<span style="font-size: x-small;"><span class="s3"><span class="Apple-tab-span"> </span> console.log(</span>'favicon request, filtered out!'<span class="s3">);</span></span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> <span class="s1">return</span> false;</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> } <span class="s1">else</span> {</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> <span class="s1">return</span> true;</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span> }</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span>},</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="Apple-tab-span"> </span>postFilter: <span class="s1">function</span>(request, response){<span class="s1">return</span> true;}</span></div>
<div class="p1">
<span style="font-size: x-small;">};</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="s1">var</span> latencyFilter = {</span></div>
<div class="p1">
<span style="font-size: x-small;"> name: <span class="s2">"latency"</span>,</span></div>
<div class="p1">
<span style="font-size: x-small;"> timer: null,</span></div>
<div class="p1">
<span style="font-size: x-small;"> preFilter: <span class="s1">function</span>(request, response) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">this</span>.timer = process.hrtime();</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> true;</span></div>
<div class="p1">
<span style="font-size: x-small;"> },</span></div>
<div class="p1">
<span style="font-size: x-small;"> postFilter: <span class="s1">function</span>(request, response) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">var</span> diff = process.hrtime(<span class="s1">this</span>.timer);</span></div>
<div class="p1">
<span style="font-size: x-small;"> console.log(<span class="s2">"<%s>%ds%dns"</span>, request.url, diff[0], diff[1]);</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> true;</span></div>
<div class="p1">
<span style="font-size: x-small;"> }</span></div>
<div class="p1">
<span style="font-size: x-small;">};</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p1">
<span style="font-size: x-small;">filterChain.add(latencyFilter);</span></div>
<div class="p1">
<span style="font-size: x-small;">filterChain.add(faviconFilter);</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p3">
<span style="font-size: x-small;">/*</span></div>
<div class="p3">
<span style="font-size: x-small;">* This is the actual function that processes the request</span></div>
<div class="p3">
<span style="font-size: x-small;">*/</span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="s1">function</span> processRequest(request, response) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> response.writeHead(200, {<span class="s2">"Content-Type"</span>: <span class="s2">"text/plain"</span>});</span></div>
<div class="p1">
<span style="font-size: x-small;"> response.write(<span class="s2">"Hello World"</span>);</span></div>
<div class="p1">
<span style="font-size: x-small;"> console.log(<span class="s2">"Response sent."</span>);</span></div>
<div class="p1">
<span style="font-size: x-small;"> response.end();</span></div>
<div class="p1">
<span style="font-size: x-small;"> <span class="s1">return</span> true;</span></div>
<div class="p1">
<span style="font-size: x-small;">}</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p1">
<span style="font-size: x-small;"><span class="s1">function</span> onRequest(request, response) {</span></div>
<div class="p1">
<span style="font-size: x-small;"> filterChain.applyAll(request, response);</span></div>
<div class="p1">
<span style="font-size: x-small;">}</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p1">
<span style="font-size: x-small;">http.createServer(onRequest).listen(8888);</span></div>
<div class="p2">
<span style="font-size: x-small;"><br /></span></div>
<div class="p4">
<span style="font-size: x-small;"><span class="s3">console.log(</span>"Server has started."<span class="s3">);</span></span></div>
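<div>
To see the order in which the chain calls its filters, here is a stripped-down sketch of the same recursion with the HTTP parts removed. The names runChain, makeFilter and calls are made up for this illustration and are not part of the server above.</div>

```javascript
// Apply filters[i..] around the handler; mirrors the recursion in filterChain.apply.
function runChain(filters, handler, request, i) {
  if (i === filters.length) {
    return handler(request);
  }
  var filter = filters[i];
  if (!filter.preFilter(request)) {
    return false; // stop: later filters and the handler never run
  }
  if (!runChain(filters, handler, request, i + 1)) {
    return false; // something deeper failed: skip this postFilter too
  }
  return filter.postFilter(request);
}

// Record every call so the ordering is visible.
var calls = [];
function makeFilter(name, allow) {
  return {
    preFilter: function () { calls.push(name + ".pre"); return allow; },
    postFilter: function () { calls.push(name + ".post"); return true; }
  };
}
function handler() { calls.push("handler"); return true; }

runChain([makeFilter("a", true), makeFilter("b", true)], handler, {}, 0);
console.log(calls.join(" ")); // a.pre b.pre handler b.post a.post

calls = [];
runChain([makeFilter("a", true), makeFilter("b", false)], handler, {}, 0);
console.log(calls.join(" ")); // a.pre b.pre
```

<div>
The second run shows a side effect of this design: when a later filter rejects the request, the postFilter of every earlier filter is skipped as well. In the server above that means favicon requests never reach latencyFilter.postFilter, so they are not timed.</div>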
CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-29382249578396541152012-07-26T21:28:00.001-07:002012-07-26T21:42:03.636-07:00How Marketable Are You?<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div style="text-align: left;">
Programming Job Market Comparison Based on indeed.com Data</div>
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6eG5fKk91ijLiyBFDOBRIV3lKuezpo2XzURAZlkE3Ffoh8IAG3CLouw73PA0vHzB-Sk0IOl-tmbbFHeyEuVUe_w9nQRsy7F2hbq5rFGOA_isVVnGYLDisMSp2rMtCMOtiyK4WZUGJDa3r/s1600/programming+salary.tiff" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><br /></a><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3LSPyK7BT4uhwRNilIuGMLRFt32giBtJs-tVHeui76wCTzIRKHLdkpqvOeKC372tnGi73QwW2OSB2eE75O3F4AEYYJV07k6ampOgJuTuoihixcve0V_8y7Au8-XeCTJCzpqi8FJsCJFb5/s1600/programming+salary.tiff" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3LSPyK7BT4uhwRNilIuGMLRFt32giBtJs-tVHeui76wCTzIRKHLdkpqvOeKC372tnGi73QwW2OSB2eE75O3F4AEYYJV07k6ampOgJuTuoihixcve0V_8y7Au8-XeCTJCzpqi8FJsCJFb5/s1600/programming+salary.tiff" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div style="text-align: left;">
<br /><br />
Database Job Market Comparison Based on indeed.com Data</div>
<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQDUm4_UkLEhIV1zXGcb6omltOSEkxL-qHV2wwPatmfkd_UYqFgT5-9eyWRrVIftQyrxeFFiBRYE6DwUat9A62FwResb-9IKe51F2k6r1uFJ42hO60XC0WkzlY2hQVbvhrZoXR6D-0tTL4/s1600/database.tiff" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiQDUm4_UkLEhIV1zXGcb6omltOSEkxL-qHV2wwPatmfkd_UYqFgT5-9eyWRrVIftQyrxeFFiBRYE6DwUat9A62FwResb-9IKe51F2k6r1uFJ42hO60XC0WkzlY2hQVbvhrZoXR6D-0tTL4/s1600/database.tiff" /></a></div>
<br />
<div>
<br /></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-75209516694474597822012-06-05T22:53:00.003-07:002013-06-06T22:16:08.048-07:00MVC now and then.<div>
The history of MVC can be traced back to the late 1970s and early 80s, when it became a key component of Smalltalk.</div>
<div>
<br /></div>
<div>
In the past 10 years, MVC has become a standard way to write web applications. Take Java for example: the usage of Struts, a popular Apache project for writing web applications, reached its peak in 2005. I still remember that anyone with Struts experience on their resume would easily get interviews around that time.<br />
<br />
Fast forward to 2007/2008: RoR became a mainstream MVC framework, partly because it was made available on the Mac. <br />
<br />
With the latest hype around HTML5 and JavaScript, a new breed of browser-level MVC frameworks has emerged. The general concept is to treat client browsers as full "applications" instead of just frontend UIs or views. <b><u>AJAX/JSON is the new model, HTML and CSS are the view, and JavaScript is the controller. </u></b>There are several popular JS-level MVC frameworks, such as SproutCore and Backbone.js. I haven't had a chance to play with them. But in all the frontend projects that I've done in the past 12 months, I definitely tried to push the model and controller all the way to the browser level.<br />
<br />
Another shift is that, with MVC moving to the browser, the backend data layer has also evolved into its own MVC. <b><u>Backend data is still the model, JSON data is the new view, and Java/PHP/C# or other backend logic is the controller.</u></b></div>
<div>
<br /></div>
CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com2tag:blogger.com,1999:blog-8401457208313287963.post-22497664134898019742012-06-05T22:37:00.001-07:002012-06-05T22:37:43.577-07:00TP99?<br />
<div>
There has been a cross-organization initiative to define and commit to TP99-based SLAs. Looking back at the post I wrote last September, I really wanted my team to understand our SLAs, and to communicate with clients using proper SLAs and monitoring tools.</div>
<div>
<br /></div>
<div>
Before this initiative, most teams tracked their performance (latency) using average processing time. The problem is that if latency has a large variance, the poor cases are hidden by the mean or median. Max latency is not very useful either. Imagine a Java-based service that does a GC once every 10 minutes and a full GC once every few hours: max latency only reflects the worst latency during GC.</div>
<div>
<br /></div>
<div>
That's why "top percentile" (TP) latency makes more sense. When you have 100 requests to your service, you can sort all the request times in ascending order. The 99th request in the list is your TP99. </div>
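<div>
As a rough sketch of that calculation, the snippet below computes a nearest-rank percentile and compares it with the mean. The function names and the sample numbers are invented for the illustration; they are not measurements from any real service.</div>

```javascript
// Nearest-rank percentile: sort ascending and take the value at rank ceil(p * n / 100).
function percentile(latencies, p) {
  if (latencies.length === 0) {
    throw new Error("no latency samples");
  }
  var sorted = latencies.slice().sort(function (a, b) { return a - b; });
  var rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function mean(latencies) {
  var sum = latencies.reduce(function (acc, x) { return acc + x; }, 0);
  return sum / latencies.length;
}

// Made-up sample: 98 fast requests at 10 ms and 2 slow ones at 2000 ms.
var sample = [];
for (var i = 0; i < 98; i++) {
  sample.push(10);
}
sample.push(2000, 2000);

console.log(mean(sample));           // 49.8
console.log(percentile(sample, 50)); // 10
console.log(percentile(sample, 99)); // 2000
```

<div>
In this sample both the mean and the median look healthy, while TP99 exposes the 2-second requests.</div>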
<div>
<br /></div>
<div>
To design TP99 SLAs, you need to keep a few things in mind:</div>
<div>
<ol>
<li>Define a time span -- You have to get latency data for every single request, sort it, and find the 99th percentile data point. So you want a reasonable amount of data collected within the time span. If you have a low-volume service that gets fewer than 100 requests per 10 minutes, you do not want to define a 10-minute SLA. If you do, your TP99 will simply be the worst case of every window. If you have a very high-volume service, you don't want to define a daily TP99, because you'll end up hiding the real problems.</li>
<li>Watch out for the extra cost of logging information -- In order to calculate TP99, you have to log every single request. If the logging system is not designed properly, you might actually degrade overall system performance by logging too much. My recommendation is to truly separate the core business logic from the operation/system-level logic, so the application doesn't have to worry about logging or calculating. I've seen existing solutions that log into SQL Server. Perf data has little or no relational structure, so writing it to SQL Server just doesn't make sense. I recommend simply writing to a local file or local service, and doing the data aggregation offline or off-hours.</li>
</ol>
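<div>
Both points can be sketched together as an offline aggregation step. The snippet assumes the service appends one line per request to a local file in the form "epochMillis latencyMs"; that log format, the function names and the sample lines are all hypothetical.</div>

```javascript
// Parse one log line of the assumed form "epochMillis latencyMs".
function parseLine(line) {
  var parts = line.trim().split(/\s+/);
  return { time: Number(parts[0]), latency: Number(parts[1]) };
}

// Nearest-rank TP99 of a non-empty list of latencies.
function tp99(latencies) {
  var sorted = latencies.slice().sort(function (a, b) { return a - b; });
  var rank = Math.ceil((99 * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Group records into fixed time windows and report count and TP99 per window.
function tp99PerWindow(lines, windowMs) {
  var buckets = {};
  lines.forEach(function (line) {
    var record = parseLine(line);
    var start = Math.floor(record.time / windowMs) * windowMs;
    (buckets[start] = buckets[start] || []).push(record.latency);
  });
  return Object.keys(buckets)
    .map(Number)
    .sort(function (a, b) { return a - b; })
    .map(function (start) {
      return { windowStart: start, count: buckets[start].length, tp99: tp99(buckets[start]) };
    });
}

// Five made-up requests spread over two one-minute windows.
var lines = ["0 12", "1000 15", "59000 900", "60000 11", "61000 13"];
console.log(tp99PerWindow(lines, 60000));
```

<div>
With so few requests per window, the TP99 of each window is just its slowest request, which is the low-volume problem described in the first point. Reporting the count next to the TP99 makes it easy to spot windows that are too small to be meaningful.</div>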
<div>
<br /></div>
</div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-6763268547999615472011-09-13T20:26:00.000-07:002011-09-13T20:41:38.345-07:00Update on the New JobTime flies. All of a sudden, I'm well into the fourth week of my new job. Here is a quick update.<div><br /></div><div>Change is difficult, and a job change is no exception. I have switched jobs many times in the past, and I found that coming in as a dev manager is especially difficult. Here is a list I made for myself before I took the job.</div><div><ul><li>Technology</li></ul><ol><li>Code base</li><li>Infrastructure (system and hardware)</li><li>Production SLA and monitoring</li></ol><div><br /></div><div><ul><li>Process</li></ul><ol><li>Development life cycle</li><li>Deployment process</li><li>Troubleshooting process</li><li>Support and escalation</li><li>Any relevant company policies</li></ol></div><div><br /></div><ul><li>Domain knowledge</li></ul><ol><li>High level business logic for all key components</li><li>Product/service in relation to revenue</li></ol><div><br /></div><ul><li>My team</li></ul><div><ol><li>Skill set</li><li>Career goals</li><li>Interests</li></ol><div><br /></div></div><ul><li>Management</li></ul><div><ol><li>Who has influence over my team, directly or indirectly</li></ol><div><br /></div></div><ul><li>Peer teams</li></ul><ol><li>Whom my team needs to work with</li><li>History between my team and peer teams</li></ol><div><br /></div></div><div>Checking my progress against the list, I still have a long way to go.</div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com3tag:blogger.com,1999:blog-8401457208313287963.post-6135355660987423962011-08-23T21:08:00.000-07:002011-08-23T21:18:10.128-07:00New BeginningStartup is exciting! Startup is hard! Startup is crazy!<div>After spending the past 4 years at Livemocha, I'm finally taking a break from the startup world. 
Yesterday, I started my new job at Expedia.</div><div>It's a new company, a new team, new technology stacks, and new business domains. It's going to be exciting, it's going to be hard, and it's also going to be crazy! </div><div>
<br /></div><div>Looking forward to the new beginning!</div><div> </div><div>
<br /></div><div>
<br /></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com1tag:blogger.com,1999:blog-8401457208313287963.post-24947998934674468082011-08-01T11:07:00.001-07:002011-08-01T11:16:13.495-07:00More on Amazon CloudI recently started working more closely with Amazon Cloud. I got the opportunity to play with Elastic Load Balancer and Relational Database Service. Here are some thoughts:<div><ul><li>ELB just works. It's very similar to the Amazon internal LB tool that I have used in the past, at least from the self-managing point of view.</li><li>I need to figure out how to add a CNAME to an ELB.</li><li>You can only attach each EC2 instance to one ELB, not to multiple ELBs. This is limiting. For example, if I want to configure virtual hosting in Apache so that the same EC2 instance serves two different web sites, I simply can't do it.</li><li>You can only do direct URL mapping (www.abc.com/efg => internal.server1/eft). You can't rewrite URLs at the LB level. Mod_rewrite is your friend. :P</li></ul><div><br /></div></div><div><ul><li>RDS is very easy to set up. However, linking EC2 to RDS is not as straightforward as it should be. You have to remember the name of the security group that you want to grant access to. Why not just use a dropdown list?</li><li>The setup process only supports one root admin account. I'm sure that once you create your database instance, you can create more users. But it's an extra step. You also can't create access rules with different users.</li></ul></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-559959035121181312010-07-28T21:39:00.000-07:002010-07-28T22:04:12.674-07:00To "cloud" or not to "cloud"Lately I have interviewed many candidates for our engineering positions. A common question from the interviewees is "why don't you host your service on the cloud"?<div>This is actually a question that we often ask ourselves. 
We love the idea of hosting all of the services on the cloud, so we don't have to manage <span class="blsp-spelling-error" id="SPELLING_ERROR_0">hardware</span>. But why haven't we done so?</div><div><ol><li>We started <span class="blsp-spelling-error" id="SPELLING_ERROR_1">Livemocha</span> in early 2007. Cloud computing wasn't mature enough. The only cloud service out there was Amazon S3. We simply could not set up an entire DC on the cloud.</li><li>Better control of hardware specs. Most cloud computing services use <span class="blsp-spelling-error" id="SPELLING_ERROR_2">VMs</span>. We still can't have full control of the hardware spec: number of <span class="blsp-spelling-error" id="SPELLING_ERROR_3">CPUs</span>, size of hard disk, speed of hard disk, memory size, etc.</li><li><span class="blsp-spelling-error" id="SPELLING_ERROR_4">NFS</span> solution. S3 is the most mature cloud file storage system. To this day, it still can't replace good old simple <span class="blsp-spelling-error" id="SPELLING_ERROR_5">NFS</span>. </li><ol><li>S3 can be mounted to multiple EC2 instances, but it's slow. You can't stream data to S3 drives.</li><li>There is no good solution for backing up S3 data. With traditional <span class="blsp-spelling-error" id="SPELLING_ERROR_6">NFS</span>, we can use either hardware or software solutions to back up an entire disk in real time.</li></ol><li>No LB support. Amazon just started offering LB last year. But its LB configuration is very simple. There is not much you can do besides simple round-robin load balancing. We use an F5 LB, which can be configured to do hardware-based https acceleration, reverse proxy, and dynamic caching.</li></ol> Here is a list of things that we do use on the cloud:</div><div><ol><li>EC2 computing on demand. If we want to generate tons of <span class="blsp-spelling-error" id="SPELLING_ERROR_7">PDFs</span> or video, we request new instances of EC2 and schedule jobs there.</li><li>S3 as secondary storage. 
We keep a copy of all user data on our <span class="blsp-spelling-error" id="SPELLING_ERROR_8">NFS</span>, then transfer duplicates to S3.</li><li><span class="blsp-spelling-error" id="SPELLING_ERROR_9">CloudFront</span>. <span class="blsp-spelling-error" id="SPELLING_ERROR_10">CloudFront</span> is awesome. It's cheap, and it's fast. </li><li><span class="blsp-spelling-error" id="SPELLING_ERROR_11">SQS</span>. We have more than 1000 queues running in <span class="blsp-spelling-error" id="SPELLING_ERROR_12">SQS</span>. They are persistent and offer <span class="blsp-spelling-corrected" id="SPELLING_ERROR_13">guaranteed</span> delivery. </li><br /></ol></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-28210707390037532772010-07-14T21:29:00.000-07:002010-07-14T21:37:33.432-07:00More on CakePHP performance related to localizationIt's been a while since I posted CakePHP performance related tips here. As a matter of fact, it's been a while since I posted anything here. Last year, I spent a lot of time tuning queries and adding code instrumentation to our system to troubleshoot performance bottlenecks. Here's more on CakePHP:<div><ol><li>Cache all your PO files. CakePHP loads localization strings from the file system. It's sloooooow. Use APC to cache your PO files. Hack it in i18n.</li><li>Cache the fallback logic in localization. CakePHP has a fairly complex structure to determine which language it should use to display a localized text. It makes multiple file reads and in-memory lookups before it can determine a display language from the browser's supported language header. Hack it in l10n.</li><li>Cache the entire view if you can. If the page is static and you only use CakePHP for localization, cache the entire page. 
You need to hack view caching and dispatcher code.</li></ol><div>If I have to quantify it, step 3 is the biggest performance gain, 300%+ faster; step 1 is second on the list, 20% to 50%; step 2 should give you another 10% to 30% improvement.</div></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-25864931955085580942010-01-21T21:12:00.000-08:002010-01-21T21:29:07.211-08:00More fun with CakePHPLivemocha has been using CakePHP for almost 3 years now. We have done multiple iterations of performance improvements. In past posts, I mentioned CakePHP's problem with some view helper classes. Now, I want to cover a few more findings.<br />1. Initializing models takes forever.<div>I don't have the exact number on hand. When I ran a profiler on our home page, which is pretty much static except for some logic around session and auto-login handling, about 30% of the time was taken by Cake trying to initialize models. After digging around a bit, I found out that instead of listing all necessary models used in the controller in $this->uses, I should initialize them individually in each action. </div><div>2. A lot of the caching code is just not optimized</div><div>The latest Cake (1.2) supports in-memory caches such as APC. However, older Cake only supported local file caching. When Cake 1.2 was released, the majority of the Cake framework code still used the old file-based caching system. This includes model caching and view caching. Disk seek is extremely slow compared to memory lookup; it's even slower than network IO if the data size is small (under 2 K). A lot of people have run those tests before, so I am not going to list the results here. I modified Cake code to have it use APC for caching models and views. </div><div>3. <cakephp-nocache> is almost useless as it is</div><div>It sounds like a great idea to be able to mark a portion of the view as not cacheable while the rest of the view is cached. 
However, if you read the fine print in the Cake documentation, you'll see that the controller action is not executed if the view is cached. It makes sense, since Cake just bypasses the controller code and goes straight to the cached view. If this is the case, what is the point of having this nocache block? If you can't supply dynamic data to a view, why does the view need to be dynamic? One thing you can try is to add some code to retrieve data in beforeFilter, which is executed if the callback flag is set to true for a view cache. However, nobody would put any db logic in beforeFilter. Again, I updated Cake code to take a list of callback functions and execute them before it tries to render the view. This way, data retrieval can be done in small functions that can be shared in the controller.</div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-8831547431080622792009-05-04T21:04:00.000-07:002009-05-04T21:24:56.051-07:00SimpleDB Batch writeAbout 3 weeks ago, we started using <span class="blsp-spelling-error" id="SPELLING_ERROR_0">SimpleDB</span> to store user activity feeds. On average, we generate between 10 and 15 feeds per second. To make the feed generation process <span class="blsp-spelling-corrected" id="SPELLING_ERROR_1">asynchronous</span>, we send messages to SQS from our datacenter. Three EC2 instances then process the SQS messages. <div><br /><div>When the system was designed and deployed, SimpleDB had just started offering the "batch write" operation. However, the PHP client did not support it at the time. We had to send one message at a time. The write performance of SimpleDB is pretty bad. We were only able to get 1 or 2 writes per second. In order to keep up with the feed generation speed, we have 5 concurrent processes running on each EC2 instance, triggered by crontab once every minute. Each process writes only up to 100 records in 1 to 2 minutes. 
So the average write time is about 1 second.</div><div><br /></div><div>Last Friday, I noticed from our monitoring tool that our SQS queue was backed up. When I checked on our EC2 instances, I noticed that the load on each instance was way too high. Somehow the writes did not perform as fast as we planned, so all the processes queued up.</div><div><br /></div><div>Today, I switched to the "batch write" operation. The performance is really good. Each individual operation only takes about 1 second, but each operation can write up to 25 records. Now, a PHP process can generate up to 1000 records in 30 seconds. The average write time is 0.03 second.</div><div><br /></div><div><br /></div><div><br /></div><div><br /></div></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com2tag:blogger.com,1999:blog-8401457208313287963.post-12735445805012025522009-02-12T22:13:00.000-08:002009-02-12T22:31:02.838-08:00NFS to S3/SQS/CFFor the last 18 months, our site has been running on <span class="blsp-spelling-error" id="SPELLING_ERROR_0">NFS</span>: all user images and audio files are stored on an <span class="blsp-spelling-error" id="SPELLING_ERROR_1">NFS</span> server that's mounted on each web host. So far, we have 5.8 million files.<div><br /></div><div>Sometime last year, we realized that we made a huge mistake in designing the <span class="blsp-spelling-error" id="SPELLING_ERROR_2">NFS</span> file structure. We only had two directories for storing all user generated files, one for images, one for audio. With more than a million files stored in a single directory, a simple "ls" command takes hours or even days. It could even bring the <span class="blsp-spelling-error" id="SPELLING_ERROR_3">NFS</span> server to its knees if we try to delete or move files around. We made some changes to audio recording so at least audio files are stored in a hierarchy. 
</div><div><br /></div><div>Another <span class="blsp-spelling-corrected" id="SPELLING_ERROR_4">incident</span> happened last week. Our primary NFS went down due to some driver issue. The backup NFS didn't kick in in time, so we had a few hours of downtime on the production site. We finally decided to move to Amazon S3 and start with user images only.</div><div><br /></div><div>The design is simple. Users still upload images to our own NFS server. When the file upload job is done, we send a message to SQS to indicate that a user has uploaded an image. A cron job listens to the SQS queue. When the cron job is executed and it finds a message in SQS, it uploads the file from our NFS to S3 and changes the file pointer in the DB. Our view layer code uses the pointer to determine where the file is located, either on NFS or in S3. Since we also enabled CloudFront, view layer code will then use our CloudFront URLs to render the image.</div><div><br /></div><div>There are several key decisions in this practice:</div><div>1. We create a backup queue in our DB in case SQS is down. Chances are that our own site goes down before SQS goes down, but we just want to be ready.</div><div>2. We always try to write to SQS first. If SQS goes down, we write to our DB queue, and we keep writing there as long as there is something in the DB queue.</div><div>3. There's a cron job that moves messages from the DB queue to SQS. This way we make sure the order is maintained (SQS doesn't really guarantee ordering, but we'll do our best).</div><div>4. The cron job that listens to SQS only needs to worry about SQS, not the DB queue.</div><div>5. We create 2 CNAMEs for our CloudFront and randomly pick one at a time when we render an image. 
This will allow browsers to utilize additional threads to retrieve data (most browsers only allow up to 4 connections to the same server).</div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-39526367187459202072009-02-08T22:01:00.000-08:002009-11-21T20:54:32.705-08:00CakePHP requestAction<span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;"><div>We are currently undergoing another round of site optimization. This time, I used APD to profile our sites. Since the home page is the most important page, I started by profiling it.</div><div><br /></div>Total Elapsed Time = 1.74<br />Total System Time = 0.51<br />Total User Time = 1.18<br /><br /><br /> Real User System secs/ cumm<br />%Time (excl/cumm) (excl/cumm) (excl/cumm) Calls call s/call Memory Usage Name<br />------------------------------<wbr>------------------------------<wbr>--------------------------<br />77.8 0.00 1.35 0.00 0.89 0.00 0.39 21 0.0000 0.0645 0 View->_render<br />77.8 0.00 1.35 0.00 0.89 0.00 0.39 23 0.0001 0.0589 0 include<br />69.1 0.00 1.20 0.00 0.83 0.00 0.36 1 0.0001 1.2032 0 main<br />64.0 1.11 1.11 0.76 0.76 0.33 0.33 1 1.1136 1.1136 0 apd_set_pprof_trace<br />48.1 0.00 0.84 0.00 0.55 0.00 0.24 20 0.0001 0.0419 0 View->renderElement<br /><span class="Apple-style-span" style="font-weight: bold;"><span class="Apple-style-span" style="color: rgb(255, 0, 0);">30.4 0.00 0.53 0.00 0.35 0.00 0.15 1 0.0001 0.5289 0 View->renderLayout</span></span><br /><span class="Apple-style-span" style="font-weight: bold;"><span class="Apple-style-span" style="color: rgb(255, 0, 0);">27.7 0.00 0.48 0.00 0.31 0.00 0.14 2 0.0001 0.2410 0 View->requestAction</span></span><br />27.7 0.00 0.48 0.00 0.31 0.00 0.14 2 0.0001 0.2407 0 Dispatcher->dispatch<br />9.4 0.00 0.16 0.00 0.11 0.00 0.05 1 0.0001 0.1636 0 MessagesController-><wbr>constructClasses<br />8.7 0.00 0.15 0.00 0.09 0.00 0.06 
1 0.0001 0.1513 0 BuddiesController-><wbr>constructClasses<br />7.0 0.00 0.12 0.00 0.10 0.00 0.02 6 0.0000 0.0202 0 call_user_func_array<br />6.8 0.02 0.12 0.01 0.08 0.01 0.04 508 0.0000 0.0002 0 Set->extract<br />6.2 0.00 0.11 0.00 0.07 0.00 0.01 2 0.0000 0.0543 0 Dispatcher->_invoke<br />5.9 0.05 0.10 0.03 0.06 0.04 0.04 2242 0.0000 0.0000 0 low<br />5.9 0.00 0.10 0.00 0.06 0.00 0.04 38 0.0000 0.0027 0 array_map<br />5.2 0.09 0.09 0.06 0.06 0.02 0.02 1305 0.0001 0.0001 0 handleError<br />5.0 0.00 0.09 0.00 0.04 0.00 0.00 2 0.0000 0.0433 0 MessagesController-><wbr>beforeFilter<br />4.3 0.00 0.08 0.00 0.04 0.00 0.01 2 0.0000 0.0377 0 Dispatcher->start<br />4.3 0.00 0.07 0.00 0.05 0.00 0.02 125 0.0000 0.0006 0 m<br />3.5 0.03 0.06 0.00 0.03 0.00 0.00 4 0.0068 0.0154 0 MochaLoggerComponent->debug<br /><br /></span><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">The highlighted lines show that the rendering takes a long time, and that the time is mostly spent in requestAction. As I recall, CakePHP requestAction is really slow. It just so happens that we have two requestAction calls in our layout, which is invoked on every single request.</span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;"><br /></span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">Instead of using requestAction to retrieve data in the layout, I added a DB call in AppController and passed the data to the layout. The result is stunning. 
</span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">Total Elapsed Time = 1.29<br />Total System Time = 0.38<br />Total User Time = 0.89<br /><br /><br /> Real User System secs/ cumm<br />%Time (excl/cumm) (excl/cumm) (excl/cumm) Calls call s/call Memory Usage Name<br />------------------------------<wbr>------------------------------<wbr>--------------------------<br />95.6 0.00 1.23 0.00 0.85 0.00 0.36 1 0.0001 1.2346 0 main<br />88.6 1.14 1.14 0.79 0.79 0.33 0.33 1 1.1441 1.1441 0 apd_set_pprof_trace<br />11.4 0.00 0.15 0.00 0.10 0.00 0.04 21 0.0000 0.0070 0 View->_render<br />10.6 0.00 0.14 0.00 0.09 0.00 0.04 21 0.0001 0.0065 0 include<br />8.5 0.00 0.11 0.00 0.08 0.00 0.03 20 0.0001 0.0055 0 View->renderElement<br />5.9 0.00 0.08 0.00 0.05 0.00 0.03 123 0.0000 0.0006 0 m<br />4.8 0.00 0.06 0.00 0.04 0.01 0.02 123 0.0000 0.0005 0 __<br /><span class="Apple-style-span" style="font-weight: bold;"><span class="Apple-style-span" style="color: rgb(255, 0, 0);">3.7 0.00 0.05 0.00 0.03 0.00 0.02 1 0.0001 0.0484 0 View->renderLayout</span></span><br />2.8 0.00 0.04 0.00 0.02 0.00 0.01 123 0.0000 0.0003 0 I18n::translate<br />2.3 0.03 0.03 0.02 0.02 0.01 0.01 403 0.0001 0.0001 0 handleError<br />1.1 0.00 0.01 0.00 0.01 0.00 0.01 27 0.0000 0.0005 0 imgsrc<br />0.7 0.00 0.01 0.00 0.01 0.00 0.01 27 0.0000 0.0004 0 getUrlHashByte<br />0.7 0.00 0.01 0.00 0.00 0.00 0.00 2 0.0000 0.0043 0 FormHelper->select<br />0.6 0.01 0.01 0.01 0.01 0.00 0.00 582 0.0000 0.0000 0 ord<br />0.6 0.00 0.01 0.00 0.01 0.00 0.00 6 0.0002 0.0013 0 require_once<br />0.6 0.01 0.01 0.00 0.00 0.00 0.00 123 0.0001 0.0001 0 debug_backtrace<br />0.6 0.00 0.01 0.00 0.01 0.00 0.00 14 0.0000 0.0005 0 Version::link<br />0.5 0.00 0.01 0.00 0.00 0.00 0.00 13 0.0000 0.0005 0 VersionHelper->link<br />0.5 0.00 0.01 0.00 0.01 0.00 0.00 1 0.0000 0.0071 0 LanguageListHelper-><wbr>nameFromCode<br />0.5 0.01 0.01 0.01 0.01 0.00 0.00 235 0.0000 0.0000 0 
is_array<br /></span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;"><br /></span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;"><br /></span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">The first time around, it took 1.74s; excluding the 1.11s spent in the profiling call, the request took 0.63s.</span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">The second time, it took 1.29s; excluding the 1.14s spent in the profiling call, the request took 0.15s.</span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;"><br /></span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">(0.63-0.15)/0.63 = 76%. </span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;"><br /></span></div><div><span class="Apple-style-span" style="border-collapse: collapse; font-family:arial;font-size:13px;">Of course, this calculation is not all that scientific. I would have to run a controlled stress test over a long period of time to get an accurate result. However, based on the simple difference in the renderLayout numbers, I think it's safe to say removing requestAction will pay off.</span></div>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-37705128320304038712008-10-11T17:50:00.000-07:002008-10-26T01:10:45.143-07:00Ubuntu LAMP1. Install packages<br />$ sudo tasksel install lamp-server<br />$ sudo apt-get install php5-curl<br />$ sudo pear install crypt_hmac<br /><br />2. Subversion<br />$ sudo apt-get install subversion<br /><br />3. 
Create an svn project<br />$ sudo mkdir /home/svn<br />$ cd /home/svn<br />$ sudo mkdir curvebreaker<br />$ sudo svnadmin create /home/svn/curvebreaker<br />$ sudo chown -R www-data curvebreaker<br />$ sudo chgrp -R subversion curvebreaker<br />$ sudo chmod -R g+rws curvebreaker<br /><br />4. Enable http access for subversion<br />$ sudo vi /etc/apache2/mods-available/dav_svn.conf<br /> Add the following lines:<br /> <Location /svn><br /> DAV svn<br /> SVNParentPath /home/svn/<br /> AuthType Basic<br /> AuthName "Subversion Repository"<br /> AuthUserFile /etc/apache2/dav_svn.passwd<br /> <LimitExcept GET PROPFIND OPTIONS REPORT><br /> Require valid-user<br /> </LimitExcept><br /> </Location><br />$ sudo htpasswd -c /etc/apache2/dav_svn.passwd xiaoj<br />$ sudo apache2ctl restart<br /><br />5. Create trunk in subversion<br />$ cd ~<br />$ mkdir workplace<br />$ cd workplace<br />$ svn co http://localhost/svn/curvebreaker<br />$ cd curvebreaker<br />$ svn mkdir trunk<br />$ svn commit -m "create trunk" trunkCodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-31991032648077552372008-10-08T19:39:00.000-07:002008-10-09T08:23:58.230-07:00Install ImageMagick & RMagick on Ubuntu 8.04#1. Get ImageMagick<br />$ sudo aptitude update<br />$ sudo aptitude install libmagick9-dev<br /><br />#2. Get RMagick<br />$ sudo gem install rmagick<br /><br />#3. Test<br />$ irb -rubygems -r RMagick<br />irb(main):001:0> puts Magick::Long_version<br />This is RMagick 2.7.0 ($Date: 2008/09/28 00:23:10 $) Copyright (C) 2008 by Timothy P. 
Hunter<br />Built with ImageMagick 6.3.7 02/19/08 Q16 http://www.imagemagick.org<br />Built for ruby 1.8.6<br />Web page: http://rmagick.rubyforge.org<br />Email: rmagick@rubyforge.org<br />=> nilCodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-87273117140169825012008-03-30T22:34:00.000-07:002008-03-30T23:16:45.035-07:00Generic retry target in ANTI think the latest ANT trunk code has a new retry target. I wrote my own retry target using ANT 1.7 and ANT-contrib. It's kinda hackish, but it works well.<br /><span style="font-size:85%;"> <target name="retry-task"><br /><var name="task.return" value="false" /><br /><for list="0,1,2" param="retry"><br /> <sequential><br /> <if><br /> <equals arg1="${task.return}" arg2="false" /><br /> <then><br /> <trycatch><br /> <try><br /> <var name="task.return" value="true" /><br /> <antcall target="${retry-target}"/><br /> </try><br /> <catch><br /> <var name="task.return" value="false" /><br /> </catch><br /> </trycatch><br /> <echo message="return ${task.return}" /><br /> </then><br /> <else><br /> </else><br /> </if><br /> </sequential><br /> </for><br /> <if><br /> <equals arg1="${task.return}" arg2="false" /><br /> <then><br /> <fail message="${retry-target} failed after 3 retries" /><br /> </then><br /> </if><br /></target><br /><br />To use this target:<br /></span><span style="font-size:85%;"> <</span><span style="font-size:85%;">antcall target="retry-task"</span><span style="font-size:85%;">></span><br /><span style="font-size:85%;"> <</span><span style="font-size:85%;">param name="retry-target" value="some-other-ant-task" /</span><span style="font-size:85%;">></span><br /><span style="font-size:85%;"> <</span><span style="font-size:85%;">/antcall</span><span 
style="font-size:85%;">></span>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com3tag:blogger.com,1999:blog-8401457208313287963.post-78550808984747791782008-02-29T12:00:00.000-08:002008-02-29T12:01:11.606-08:00testtestCodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-41453122694371148462007-11-04T19:25:00.000-08:002007-11-04T19:46:21.590-08:00MochaLogger Alpha ReleasePHP is an interesting language. It's designed specifically for writing web sites. Furthermore, it's designed to do mostly presentation layer work. CakePHP follows Ruby on Rails' framework. It has a somewhat well defined MVC framework, good scaffolding functions and alright performance. PHP as a presentation layer language allows the system logs to be sent to the client's browser. A developer can use the browser to debug server-side problems quickly. CakePHP also provides this function.<br /><br />Displaying system logs in the browser is very useful during dev testing. However, in a real production environment, nobody wants to expose any system logs to the end users. CakePHP's file logging system is poorly implemented. It only offers two levels of logs: DEBUG or ERROR. Furthermore, debug logs and error logs are saved to two different files. This makes troubleshooting system problems by reading logs very difficult. <br /><br />I'm a long time Java user. Log4J is a great logging system. It allows users to write to the same or multiple log files with DEBUG, INFO, WARN, ERROR or FATAL level log messages. With Log4J, a developer can add DEBUG logs as detailed as needed to his code without worrying about printing out too much information on the production server. He can simply set the production server's log level to WARN and above only. A developer can even configure different log levels for different Java classes.<br /><br />In order to log more meaningful information in our system, I implemented MochaLogger. 
It's a plugin for CakePHP. It offers basic logging functions with 5 different levels of logs. It has a UI that allows users to configure logging levels for each PHP file/object at run time. MochaLogger is open sourced on CakeForge: <a href="http://cakeforge.org/frs/?group_id=215">http://cakeforge.org/frs/?group_id=215</a>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-9631330154686551482007-09-25T09:21:00.000-07:002007-09-25T09:23:49.859-07:002000+ new user registrations in day 1<span class="blsp-spelling-error" id="SPELLING_ERROR_0">Livemocha</span> survived day 1 with 2000+ new user registrations. We <span class="blsp-spelling-corrected" id="SPELLING_ERROR_1">closely</span> monitored the CPU, memory, and <span class="blsp-spelling-error" id="SPELLING_ERROR_2">NIC</span> usage on all our servers. It's time to add more machines.<br />Come visit us at: <a href="http://www.livemocha.com/">http://www.livemocha.com</a>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-68014822475997452862007-09-13T21:43:00.000-07:002007-09-13T22:05:23.095-07:00CakePHP Performance TuningUsing <span class="blsp-spelling-error" id="SPELLING_ERROR_0">JMeter</span>, I created a load test with 5 simultaneous users hitting our <span class="blsp-spelling-error" id="SPELLING_ERROR_1">login</span>, home and <span class="blsp-spelling-error" id="SPELLING_ERROR_2">logout</span> pages. Each user had a delay of 1 to 2 seconds between steps during the load test. This yielded a throughput of 3.5 requests/second on the server. <br />Before the basic performance tuning, our server response time for the home page was about 2,600ms per request. We did the following tuning and reduced the page response time to 9ms.<br />1. 
Enable <span class="blsp-spelling-error" id="SPELLING_ERROR_3">APC</span> -- Alternative <span class="blsp-spelling-error" id="SPELLING_ERROR_4">PHP</span> Cache caches <span class="blsp-spelling-error" id="SPELLING_ERROR_5">opcode</span> and reduces interpreter workload during <span class="blsp-spelling-error" id="SPELLING_ERROR_6">runtime</span>. This reduced the response time by 10%.<br />2. Enable <span class="blsp-spelling-error" id="SPELLING_ERROR_7">CakePHP</span> cache -- <span class="blsp-spelling-error" id="SPELLING_ERROR_8">CakePHP</span> can cache a controller's result, so when a controller is requested, the view will directly render the result instead of going to the database. We simply enabled the view cache without making any changes to the code. This actually <span class="blsp-spelling-corrected" id="SPELLING_ERROR_9">eliminated</span> all the table DESCRIBE queries. Again, this step yielded a 10% performance gain.<br />3. html->link function -- This is the most interesting part of this performance tuning. K.S. wrote some awesome code to profile CakePHP code. He found out that each html->link call takes 50 to 100ms to return. When we render many links on a page, the time adds up. Instead of using the fancy CakePHP link function, we just write plain HTML code for links. 
This gave us almost 50% performance gain.CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com6tag:blogger.com,1999:blog-8401457208313287963.post-3026665859349772342007-07-24T13:36:00.001-07:002007-07-24T13:36:04.661-07:00FlickrThis is a test post from <a href="http://www.flickr.com/r/testpost"><img alt="flickr" src="http://www.flickr.com/images/flickr_logo_blog.gif" width="41" height="18" border="0" align="absmiddle" /></a>, a fancy photo sharing thing.CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-76077889116336996172007-07-20T08:18:00.000-07:002007-07-20T08:32:41.478-07:00Use Ant to deploy to remote hostsAnt is an awesome compile/package/deployment tool for Java code. But it's not limited to Java programs only. Currently, I'm working on a PHP site. We actually use ant to deploy our app. I don't really want to list the details here, but I can explain the basic steps.<br />1. Staging<br />During this step, you use the ant <span style="font-weight: bold; font-style: italic;">filter </span>feature to customize all the configuration and resources. For example, you might have a different database for each environment. The developers' web server talks to the Dev database, QA's web server talks to the QA database, and prod's web server talks to the Prod database. You use the ant copy and filter features to stage all the files from the source location to a target location.<br />2. Packaging<br />Simply use the ant <span style="font-weight: bold; font-style: italic;">tar </span>and zip features to package everything into a compressed file. You might want to add a timestamp or version number to the tar file name so you know which version you are deploying.<br />3. Deployment<br />To transfer the application package to a remote host, you should use the ant <span style="font-weight: bold; font-style: italic;">scp </span>task. 
Since we have multiple web servers, we also use the ant <span style="font-weight: bold; font-style: italic;">for </span>task, which tokenizes a list of host names and creates a for loop so the embedded scp task will copy files to each remote host. On the remote hosts, you should have a directory for holding all the app packages. You can use <span style="font-weight: bold; font-style: italic;">sshexec </span>to remotely create any directories.<br />4. Installation<br />You can easily write a shell script and deploy it along with the application package, or use ant <span style="font-weight: bold; font-style: italic;">sshexec </span>to unpackage the application. However, I highly recommend using a separate shell script, because then you won't have to rely on ant to install your software, which means you can install/rollback/rollforward your production server from anywhere.CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0tag:blogger.com,1999:blog-8401457208313287963.post-66837894636380284172007-07-11T18:13:00.000-07:002007-07-11T18:22:51.198-07:00Ant JDBC and UTF8You really have to watch out when you want to use UTF8 for your database: enabling Unicode support in the database, creating tables with the correct encoding settings for each column, etc. There's one more catch I just realized as I was trying to use ant JDBC to load some Chinese characters into our database.<br />You need to specify UTF8 and unicode in the JDBC connection string.<br /><sql driver="com.mysql.jdbc.Driver" url="jdbc:mysql://${db.host}?useUnicode=true&amp;characterEncoding=UTF-8" userid="${db.user}" password="${db.password}"><br /> <transaction src="${target.dir}/database/script.mysql"/><br /></sql>CodeSithhttp://www.blogger.com/profile/16825444397064317749noreply@blogger.com0
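For the Ant JDBC and UTF8 post above, here is a rough Python sketch (not part of the original build) of why that connection string is easy to get wrong. The URL needs both useUnicode and characterEncoding, and the & between them has to be written as &amp; once the URL sits inside an XML attribute in build.xml. The host db.example.com, the database name mocha, and both helper names are made up for illustration.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr


def mysql_jdbc_url(host: str, database: str = "") -> str:
    """Build a MySQL JDBC URL that asks the driver for Unicode/UTF-8.

    Hypothetical helper: only the two query parameters come from the post.
    """
    query = urlencode({"useUnicode": "true", "characterEncoding": "UTF-8"})
    return f"jdbc:mysql://{host}/{database}?{query}"


def as_xml_attribute(value: str) -> str:
    """Quote a value for use as an XML attribute (escapes &, < and >)."""
    return quoteattr(value)


# Hypothetical host and database name, for illustration only.
url = mysql_jdbc_url("db.example.com", "mocha")
print(url)                    # the raw JDBC URL
print(as_xml_attribute(url))  # the same URL, safe to paste into build.xml
```

The second print shows the form that belongs in the url attribute of the Ant sql task; pasting the raw URL with a bare & makes the build file invalid XML.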