CodeSith's Blog

Why is it important to follow "open standards"

2013-05-22T21:55:00.002-07:00

An open standard is a standard that is publicly available and has various rights to use associated with it, and may also have various properties of how it was designed (e.g. open process). There is no single definition and interpretations vary with usage. --- Wikipedia

Why do I promote "open standards" in my team?

1. To avoid reinventing the wheel.

If a public solution works well enough that many people have already adopted it, there's no reason to rebuild it. For example, major programming languages already ship implementations of merge sort or quicksort. I would not want my developers to rewrite those sorting algorithms anywhere other than in interviews.

2. Free test coverage.
Open standards are usually used by many individuals and organizations. If I write a library function that closes file handles "silently", I have to write a lot of unit tests. The test team will have to write a lot of functional tests. We all have to run tests in multiple environments (Linux, Mac, Windows, etc.). We also need to create many stress test cases to cover failures. Apache Commons IO offers this function, and it has already been tested and used by many developers. I don't need to test it as much.

3. Open standards usually work well with each other.
Unlike a "proprietary standard", an "open standard" is meant to be shared. Therefore, many open standards work very similarly. Recently, one of my devs ported a traditional web services application over to a smart compute grid infrastructure. He changed fewer than 500 lines of code, and that was only possible because the original web services container and the compute grid share the same "open standard".

4. Open standards expand one's horizon.

A lot of developers who have only worked with certain proprietary technologies are fairly narrow-minded. I have seen enough .NET developers who believe that every data-storage problem should be solved by SQL Server. And many C#-only developers do not know IoC (inversion of control). SQL Server is a solid relational DB, and C# is a useful language. However, there are plenty of alternative open technologies that are equally good, or even better in certain cases.

5. Knowing open standards makes you more marketable.
For all the reasons above, and the ones in my previous post, everyone should learn open standards to make themselves more marketable.

Test in Production!

2013-05-05T17:51:00.002-07:00

Anyone has a problem with the picture above? I certainly don't! However, I'll add this, "when I test in production, I test it carefully".

In software service development, testing is critical. Services often go through multiple tiers of testing environments before the bits are finally released to the customer. But why can't we do one-stop testing directly in production?

The typical answer to this question is "NO, YOU CANNOT IMPACT CUSTOMERS!". This is because the traditional practice of "testing on production" is to take a slice of the production traffic, and to put a pre-release version service behind the VIP along with all other current version services. This way, only a small portion of customers are impacted. But some customers ARE impacted!

This is where "port forwarding" comes in handy. To ensure that NO customers are impacted at all, we can "clone" a slice of the production traffic and send the copy to a pre-release version running alongside all the current version services. The cloned traffic is one-way only: it goes into the pre-release version service, but nothing it produces ever returns to the customer. This way, NO customers are impacted by the behavior of the "un-tested" pre-release service.

What do we get out of this?

Free stress test! Instead of trying to set up a stress testing environment that simulates production volumes and request patterns, you will have a service running in production, getting production requests and throughput.
Free regression test! If you compare the responses from the current version service and the pre-release version service, you get yourself a simple regression test suite.
Frequent regression test! Hook this into your favorite CI framework, and you get to run regression tests 24/7, given that you have production traffic 24/7.
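To make the "free regression test" idea concrete, here is a minimal sketch of comparing one response from the current version with the matching response from the pre-release version. The response shape (`status`, `headers`, `body`) and the list of headers to ignore are assumptions made for illustration, not part of any particular tool.

```javascript
// Headers that legitimately differ between two services (an assumed list).
const VOLATILE_HEADERS = ["date", "x-request-id"];

// Lower-case header names and drop the volatile ones before comparing.
function normalize(response) {
  const headers = {};
  Object.keys(response.headers).forEach(function (name) {
    const key = name.toLowerCase();
    if (VOLATILE_HEADERS.indexOf(key) === -1) {
      headers[key] = response.headers[name];
    }
  });
  return { status: response.status, headers: headers, body: response.body };
}

// Return a list of the fields that differ; an empty list means "no regression seen".
function diffResponses(current, candidate) {
  const a = normalize(current);
  const b = normalize(candidate);
  const diffs = [];
  if (a.status !== b.status) { diffs.push("status"); }
  if (a.body !== b.body) { diffs.push("body"); }
  const names = new Set(Object.keys(a.headers).concat(Object.keys(b.headers)));
  names.forEach(function (name) {
    if (a.headers[name] !== b.headers[name]) { diffs.push("header:" + name); }
  });
  return diffs;
}
```

Any non-empty result can be logged and reviewed later; nothing in this comparison touches the customer-facing path.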

There are multiple ways to do "port forwarding". Advanced load balancers have built-in features to clone a subset of requests. If you are running Linux, you can configure iptables to forward specific ports. Since I'm not a sysadmin, I prefer software solutions that I can modify and manage.

Node.js has a nice plugin, node-proxy. You can run a proxy service that bridges traffic to a "target" service and a "forwarding" service. The target service is the one that handles real traffic; its responses are sent back to the customer through the proxy. The forwarding service is one-way only. It gets the same requests as the target service, but never returns anything to the customer. With this setup, you can TEST on PRODUCTION!
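The target/forwarding split can be sketched with nothing but Node's built-in http module. This is an illustration of the idea rather than the plugin's API; the backend addresses and the names `forwardOptions` and `createTeeProxy` are made up for the example.

```javascript
const http = require("http");

// Hypothetical addresses: the current version and the pre-release version.
const TARGET = { host: "127.0.0.1", port: 9001 };
const SHADOW = { host: "127.0.0.1", port: 9002 };

// Build the options needed to re-send an incoming request to a backend.
function forwardOptions(req, backend) {
  return {
    host: backend.host,
    port: backend.port,
    method: req.method,
    path: req.url,
    headers: req.headers
  };
}

function createTeeProxy(target, shadow) {
  return http.createServer(function (req, res) {
    // Real path: the target's response is streamed back to the customer.
    const toTarget = http.request(forwardOptions(req, target), function (upstream) {
      res.writeHead(upstream.statusCode, upstream.headers);
      upstream.pipe(res);
    });
    toTarget.on("error", function () {
      if (!res.headersSent) { res.writeHead(502); }
      res.end();
    });

    // One-way path: same request, but the response is drained and dropped.
    const toShadow = http.request(forwardOptions(req, shadow), function (upstream) {
      upstream.resume();
    });
    toShadow.on("error", function () {
      // Failures of the pre-release service never reach the customer.
    });

    // Note: piping to both means a slow shadow can apply backpressure;
    // a production setup would buffer or sample instead.
    req.pipe(toTarget);
    req.pipe(toShadow);
  });
}

// Usage (not started here): createTeeProxy(TARGET, SHADOW).listen(8000);
```

To clone only a slice of the traffic, the shadow request could be sent for a random fraction of requests instead of all of them.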

Create FilterChain in node.js

2012-12-23T21:58:00.002-08:00

I often try to learn something interesting (mostly programming related) whenever I get a long break from work. Last Thanksgiving, I wrote an iPhone app that syncs photos among S3, Flickr and Facebook. This Xmas, I took on writing my first node.js app.

I used http://www.nodebeginner.org/ as a starting point. Five minutes into the tutorial, I encountered a strange problem. For every request sent from the Chrome browser to my node.js server, my service recorded TWO requests. A quick Google search indicated that Chrome ALWAYS sends an additional "/favicon.ico" request to an HTTP server if it cannot locate an icon for that server.

This was really annoying because it messed up my global debugging counter. The solution was simple: just ignore all requests for "/favicon.ico". But "SIMPLE" solutions are no fun, especially while learning. If I were to do this in Java, I'd use a ServletFilter to "preFilter" out all unqualified requests. So I put my JavaScript and Java skills to the test, and wrote this simple FilterChain function in node.js. Enjoy!

var http = require("http");

/**
 * Manage all filters.
 */
var filterChain = {
  filters: [],

  add: function(filter) {
    this.filters.push(filter);
  },

  applyAll: function(request, response) {
    this.apply(request, response, 0);
  },

  apply: function(request, response, i) {
    // every filter has passed: process the request itself
    if (i === this.filters.length) {
      return processRequest(request, response);
    }

    var filter = this.filters[i];

    // call preFilter and exit if it fails
    console.log(filter.name + ".preFilter");
    var success = filter.preFilter(request, response);
    if (!success) {
      return false;
    }

    // call the next filter and exit if it fails
    success = this.apply(request, response, i + 1);
    if (!success) {
      return false;
    }

    // call postFilter and exit
    console.log(filter.name + ".postFilter");
    return filter.postFilter(request, response);
  }
};

/**
 * Filters
 * All filters must implement 3 things:
 *   name         - String, unique name for this filter
 *   preFilter()  - Executed before processing the request
 *   postFilter() - Executed after processing the request
 */
var faviconFilter = {
  name: "favicon",

  preFilter: function(request, response) {
    if (request.url === '/favicon.ico') {
      // answer the icon request right here and stop the chain
      response.writeHead(200, {'Content-Type': 'image/x-icon'});
      response.end();
      console.log('favicon request, filtered out!');
      return false;
    } else {
      return true;
    }
  },

  postFilter: function(request, response) { return true; }
};

var latencyFilter = {
  name: "latency",
  timer: null,

  preFilter: function(request, response) {
    this.timer = process.hrtime();
    return true;
  },

  postFilter: function(request, response) {
    // diff is [seconds, nanoseconds] elapsed since preFilter
    var diff = process.hrtime(this.timer);
    console.log("<%s>%ds%dns", request.url, diff[0], diff[1]);
    return true;
  }
};

filterChain.add(latencyFilter);
filterChain.add(faviconFilter);

/**
 * This is the actual function that processes the request
 */
function processRequest(request, response) {
  response.writeHead(200, {"Content-Type": "text/plain"});
  response.write("Hello World");
  console.log("Response sent.");
  response.end();
  return true;
}

function onRequest(request, response) {
  filterChain.applyAll(request, response);
}

http.createServer(onRequest).listen(8888);
console.log("Server has started.");

How Marketable Are You?

2012-07-26T21:28:00.001-07:00

Programming Job Market Comparison Based on indeed.com Data

Database Job Market Comparison Based on indeed.com Data

MVC now and then.

2012-06-05T22:53:00.003-07:00

The history of MVC can be traced back to the early 80's. It was a key component of Smalltalk.

In the past 10 years, MVC has become a standard way to write web applications. Take Java for example: the usage of Struts, a popular Apache project for writing web applications, reached its peak in 2005. I still remember that everyone with Struts experience on their resume would easily get interviews around that time.

Fast forward to 2007/2008: RoR became a mainstream MVC framework, partly because it was made available on the Mac.

With the latest hype around HTML5 and JavaScript, a new breed of browser-level MVC frameworks has emerged. The general concept is to treat client browsers as full "applications" instead of just frontend UIs or views. AJAX/JSON is the new model, HTML and CSS are the view, and JavaScript is the controller. There are several popular JS-level MVC frameworks, such as SproutCore and Backbone.js. I haven't had a chance to play with them. But in all the frontend projects that I've done in the past 12 months, I definitely tried to push the model and controller all the way to the browser level.

Another revolution is that, with MVC moving to the browser, the backend data layer has also evolved into its own MVC. Backend data is still the model, JSON data is the new view, and Java/PHP/C# or other backend logic is the controller.

TP99?

2012-06-05T22:37:00.001-07:00

There has been a cross-organization initiative to define and commit to TP99-based SLAs. Looking back at the post I did last Sept, I really wanted my team to understand our SLAs, and to communicate with clients using proper SLAs and monitoring tools.

Before this initiative, most of the teams tracked their performance (latency) using average processing time. The problem is that if performance has a large variance, poor latency is hidden by the mean or median. Max latency is also not very relevant. Imagine a Java-based service that does a GC once every 10 minutes and a full GC once every few hours: max latency only reflects the worst latency during GC.

That's why "top percentile" (TP) based latency makes more sense. When you have 100 requests to your service, you can sort all the request times in ascending order. The 99th request in the list is your TP99.
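The sort-and-pick description above can be sketched in a few lines. This uses the nearest-rank definition of a percentile, which is one common convention and an assumption here; the sample latencies are made up to show how the mean hides the slow requests.

```javascript
// Nearest-rank percentile: the value at position ceil(p% of n) in the sorted list.
function percentile(latencies, p) {
  if (latencies.length === 0) {
    throw new Error("no samples");
  }
  const sorted = latencies.slice().sort(function (a, b) { return a - b; });
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based position
  return sorted[Math.max(rank, 1) - 1];
}

// 98 fast requests plus two slow ones (made-up numbers, in milliseconds).
const samples = [];
for (let i = 0; i < 98; i++) { samples.push(20); }
samples.push(900, 1500);

const mean = samples.reduce(function (sum, x) { return sum + x; }, 0) / samples.length;
console.log("mean:", mean);                      // 43.6
console.log("TP99:", percentile(samples, 99));   // 900
console.log("max: ", percentile(samples, 100));  // 1500
```

The mean looks healthy at 43.6 ms, the max is dominated by a single outlier, and TP99 reports the latency that 99 out of 100 requests stayed at or under.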

To design TP99 SLAs, you need to keep a few things in mind:

Define a time span -- You have to collect latency data for every single request, sort it, and find the 99th-percentile data point, so you want a reasonable amount of data in each time span. If you have a low-volume service that gets fewer than 100 requests per 10 minutes, you do not want to define a 10-minute SLA; if you do, the TP99 is effectively the worst case in every window. If you have a very high-volume service, you don't want to define a daily TP99, because you'll end up hiding the real problems.
Watch out for the extra cost of logging -- In order to calculate TP99, you have to log every single request. If the logging system is not designed properly, you might actually degrade the overall system performance by logging too much. My recommendation is to truly separate the core business logic from the operation/system-level logic, so the application doesn't have to worry about logging or calculating. I've seen existing solutions that log into SQL Server. Perf data has few or no relationships, so writing it to a relational database just doesn't make sense. I recommend simply writing to a local file or local service, and doing the data aggregation offline or during off hours.
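As a rough sketch of the offline aggregation step: assume the service appends one line per request to a local file in the hypothetical format `<epoch ms> <latency ms>`, and a separate job later groups those lines into fixed windows. The log format, the function names and the window size are all assumptions for illustration.

```javascript
// Hypothetical log format, one request per line: "<epoch ms> <latency ms>".
function parseLine(line) {
  const parts = line.trim().split(/\s+/);
  return { time: Number(parts[0]), latency: Number(parts[1]) };
}

// Nearest-rank TP99 of a non-empty list of latencies.
function tp99(values) {
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  return sorted[Math.max(Math.ceil((99 * sorted.length) / 100), 1) - 1];
}

// Group requests into fixed time windows and report the TP99 and sample count of each.
function tp99ByWindow(lines, windowMs) {
  const buckets = new Map();
  lines.filter(function (l) { return l.trim() !== ""; }).forEach(function (line) {
    const rec = parseLine(line);
    const start = Math.floor(rec.time / windowMs) * windowMs;
    if (!buckets.has(start)) { buckets.set(start, []); }
    buckets.get(start).push(rec.latency);
  });
  const report = [];
  buckets.forEach(function (latencies, start) {
    report.push({ windowStart: start, count: latencies.length, tp99: tp99(latencies) });
  });
  return report.sort(function (a, b) { return a.windowStart - b.windowStart; });
}
```

Reporting the count next to each TP99 makes it easy to spot windows with too few samples, which is the low-volume problem described in the first point.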

Update on the New Job

2011-09-13T20:26:00.000-07:00

Time flies. All of a sudden, I'm well into the fourth week of my new job. Here is a quick update.

Changes are difficult, and a job change is no exception. I have switched jobs many times in the past, and I found that coming in as a dev manager is especially difficult. Here is a list I made for myself before I took the job.

Technology

Code base
Infrastructure (system and hardware)
Production SLA and monitoring

Process

Development life cycle
Deployment process
Troubleshooting process
Support and escalation
Any relevant company policies

Domain knowledge

High level business logic for all key components
Product/service in relation to revenue

My team

Skill set
Career goals
Interests

Management

Who has influence over my team, directly or indirectly

Peer teams

Whom my team needs to work with
History between my team and peer teams

Checking my progress against the list, I still have a long way to go.

New Beginning

2011-08-23T21:08:00.000-07:00

Startup is exciting! Startup is hard! Startup is crazy!

After spending the past 4 years at Livemocha, I'm finally taking a break from the startup world. Yesterday, I started my new job at Expedia.

It's a new company, a new team, new technology stacks, and new business domains. It's going to be exciting, it's going to be hard, and it's also going to be crazy!

Looking forward to the new beginning!

To "cloud" or not to "cloud"

2010-07-28T21:39:00.000-07:00

Lately I have interviewed many candidates for our engineering positions. A common question from the interviewees is always "why don't you host your service on the cloud"?

This is actually a question that we often ask ourselves. We love the idea of hosting all of our services on the cloud, so we don't have to manage hardware. But why haven't we done so?

We started Livemocha in early 2007. Cloud computing wasn't mature enough. The only cloud service out there was Amazon S3. We simply could not set up an entire DC on the cloud.
Better control of hardware specs. Most cloud computing services use VMs. We still can't have full control of the hardware spec: number of CPUs, size of hard disk, speed of hard disk, memory size, etc.
NFS solution. S3 is the most mature cloud file storage system. To this day, it still can't replace good old simple NFS.

S3 can be mounted to multiple EC2 instances, but it's slow. You can't stream data to S3 drives.
There is no good solution for backing up S3 data. With traditional NFS, we can use either hardware or software solutions to back up an entire disk in real time.

No LB support. Amazon only started offering LB last year, and its LB configuration is very simple. There is not much you can do besides simple round-robin load balancing. We use an F5 LB, which can be configured to do hardware-based HTTPS acceleration, reverse proxying, and dynamic caching.

Here is a list of things that we do use on the cloud

EC2 computing on demand. If we want to generate tons of PDFs or video, we request new instances of EC2 and schedule jobs there.
S3 as secondary storage. We keep a copy of all user data on our NFS, then transfer duplicates to S3.
CloudFront. CloudFront is awesome. It's cheap, and it's faster.
SQS. We have more than 1000 queues running in SQS. They are persistent, with guaranteed delivery.

More on CakePHP performance related to localization

2010-07-14T21:29:00.000-07:00

It's been a while since I posted CakePHP performance related tips here. As a matter of fact, it's been a while since I posted anything here. Last year, I spent a lot of time tuning queries and adding code instruments in our system to troubleshoot performance bottlenecks. Here's more on CakePHP:

Cache all your PO files. CakePHP loads localization strings from the file system. It's sloooooow. Use APC to cache your PO files. Hack it in i18n.
Cache the fallback logic in localization. CakePHP has a fairly complex structure for determining which language it should use to display a localized text. It makes multiple file reads and in-memory lookups before it can determine a display language from the browser's supported-language header. Hack it in l10n.
Cache the entire view if you can. If the page is static and you only use CakePHP to do localization, cache the entire page. You need to hack the view caching and dispatcher code.

If I have to quantify it, step 3 is the biggest performance gain at 300%+ faster, step 1 is second on the list at 20% to 50%, and step 2 should give you another 10% to 30% improvement.

More fun with CakePHP

2010-01-21T21:12:00.000-08:00

Livemocha has been using CakePHP for almost 3 years now. We have done multiple iterations of performance improvements. In past posts, I mentioned CakePHP's problem with some view helper classes. Now, I want to cover a few more findings.
1. Initializing models takes forever.

I don't have the exact number on hand. When I ran a profiler on our home page, which is pretty much static except for some logic around session and auto-login handling, about 30% of the time was taken by Cake trying to initialize models. After digging around a bit, I found out that instead of listing all the models used in the controller in $this->uses, I should initialize them individually in each action.

2. A lot of the caching code is just not optimized

The latest Cake (1.2) supports in-memory caches such as APC. However, old Cake only supported local file caching. When Cake 1.2 was released, the majority of the Cake framework code still used the old file-based caching system. This includes model caching and view caching. A disk seek is extremely slow compared to a memory lookup; it's even slower than network IO if the data size is small (under 2 K). A lot of people have run those tests before, so I am not going to list the results here. I modified the Cake code to make it use APC for caching models and views.

3. The noncache block is almost useless as it is

It sounds like a great idea to be able to mark a portion of the view as not cacheable while the rest of the view is cached. However, if you read the fine print in the Cake documentation, you'll see that the controller action is not executed if the view is cached. It makes sense, since Cake just bypasses the controller code and goes straight to the cached view. If this is the case, what is the need for this noncache block? If you can't supply dynamic data to a view, why does the view need to be dynamic? One thing you can try is to add some code that retrieves data in beforeFilter, which is executed if the callback flag is set to true for a view cache. However, nobody would want to push DB logic into beforeFilter. Again, I updated the Cake code to take a list of callback functions and execute them before it tries to render the view. This way, data retrieval can be done in small functions that can be shared in the controller.

SimpleDB Batch write

2009-05-04T21:04:00.000-07:00

About 3 weeks ago, we started using SimpleDB to store user activity feeds. On average, we generate between 10 and 15 feeds per second. To make the feed generation process asynchronous, we send messages to SQS from our datacenter. Three EC2 instances then process the SQS messages.

When the system was designed and deployed, SimpleDB had just started offering a "batch write" operation. However, the PHP client did not support it at the time. We had to send one message at a time. The write performance of SimpleDB is pretty bad. We were only able to get 1 or 2 writes per second. In order to keep up with the feed generation speed, we have 5 concurrent processes running on each EC2 instance, triggered by crontab once every minute. Each process only writes up to 100 records in 1 to 2 minutes. So the average write time is about 1 second.

Last Friday, I noticed from our monitoring tool that our SQS queue was backed up. When I checked on our EC2 instances, I noticed that the load on each instance was way too high. Somehow the writes did not perform as fast as we planned, so all the processes queued up.

Today, I switched to the "batch write" operation. The performance is really good. Each individual operation still takes about 1 second, but each operation can write up to 25 records. Now, a PHP process can write up to 1000 records in 30 seconds. The average write time is 0.03 seconds per record.
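The batching itself is simple to sketch. In the snippet below, `writeBatch` is a hypothetical stand-in for whatever client call performs one batch write; it is not a real SDK function, and only the 25-record limit comes from the numbers above.

```javascript
const BATCH_SIZE = 25; // per-operation limit quoted above

// Split a list of records into consecutive groups of at most `size` items.
function toBatches(records, size) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push(records.slice(i, i + size));
  }
  return batches;
}

// writeBatch is a hypothetical callback that sends one batch to the datastore.
// Returns the number of write operations that were issued.
function writeAll(records, writeBatch) {
  const batches = toBatches(records, BATCH_SIZE);
  batches.forEach(function (batch) { writeBatch(batch); });
  return batches.length;
}
```

With this grouping, 1000 records need 40 operations instead of 1000, and 1000 records in 30 seconds works out to 30 / 1000 = 0.03 seconds per record.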

NFS to S3/SQS/CF

2009-02-12T22:13:00.000-08:00

For the last 18 months, our site has been running on NFS: all user images and audio files are stored on an NFS server that's mounted on each web host. So far, we have 5.8 million files.

Sometime last year, we realized that we had made a huge mistake in designing the NFS file structure. We only had two directories for storing all user-generated files, one for images and one for audio. With more than a million files stored in a single directory, a simple "ls" command takes hours or even days. It could even bring the NFS server to its knees if we tried to delete or move files around. We made some changes to audio recording so that at least audio files are stored in a hierarchy.

Another incident happened last week. Our primary NFS went down due to a driver issue. The backup NFS didn't kick in in time, so we had a few hours of downtime on the production site. We finally decided to move to Amazon S3, starting with user images only.

The design is simple. Users still upload images to our own NFS server. When the file upload job is done, we send a message to SQS to indicate that a user has uploaded an image. A cron job listens to the SQS queue. When the cron job runs and finds a message in SQS, it uploads the file from our NFS to S3 and changes the file pointer in the DB. Our view layer code uses the pointer to determine where the file is located, either on NFS or in S3. Since we also enabled CloudFront, the view layer code then uses our CloudFront URLs to render the image.
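One common way to build such a hierarchy, shown here only as a sketch and not as the layout we actually used, is to derive two directory levels from a hash of the file name. The root path and the function name are hypothetical.

```javascript
const crypto = require("crypto");
const path = require("path");

// Hypothetical layout: <root>/<2 hex chars>/<2 hex chars>/<file name>.
// Two levels of 256 directories give 65,536 leaf directories.
function shardedPath(root, fileName) {
  const hash = crypto.createHash("md5").update(fileName).digest("hex");
  return path.posix.join(root, hash.slice(0, 2), hash.slice(2, 4), fileName);
}

console.log(shardedPath("/data/audio", "clip-42.mp3"));
```

Spread evenly, 5.8 million files across 65,536 directories is roughly 90 files per directory, which keeps "ls" and similar operations fast. The hash is used only to spread files out, not for security.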

There are several key decisions in this practice:

1. We create a backup queue in our DB in case SQS is down. Chances are our own site will go down before SQS does, but we just want to be ready.

2. We always try to write to SQS first. If SQS goes down, we write to our DB queue, and we keep writing to it for as long as there is something in the DB queue.

3. There's a cron job that moves messages from the DB queue to SQS, so we make sure the order is maintained (SQS doesn't really guarantee ordering, but we'll do our best).

4. The cron job that listens to SQS only needs to worry about SQS, not the DB queue.

5. We create 2 CNAMEs for our CloudFront distribution and randomly pick one each time we render an image. This allows browsers to use additional threads to retrieve data (most browsers only allow up to 4 connections to the same server).
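Decisions 1 to 3 can be sketched with in-memory stand-ins for the two queues. None of this is SQS client code; `makeQueue`, `createEnqueuer` and `drainBackup` are hypothetical names used only to show the ordering logic.

```javascript
// In-memory stand-in for a queue; set `up` to false to simulate an outage.
function makeQueue() {
  const items = [];
  return {
    up: true,
    items: items,
    push: function (m) {
      if (!this.up) { throw new Error("queue unavailable"); }
      items.push(m);
    },
    peek: function () { return items[0]; },
    shift: function () { return items.shift(); },
    size: function () { return items.length; }
  };
}

// Write to the primary queue first, falling back to the backup queue.
function createEnqueuer(primary, backup) {
  return function enqueue(message) {
    // While older messages are parked in the backup, keep appending there to preserve order.
    if (backup.size() > 0) {
      backup.push(message);
      return "backup";
    }
    try {
      primary.push(message);
      return "primary";
    } catch (err) {
      backup.push(message);
      return "backup";
    }
  };
}

// Periodic job: move parked messages to the primary queue, oldest first.
function drainBackup(primary, backup) {
  let moved = 0;
  while (backup.size() > 0) {
    try {
      primary.push(backup.peek());
    } catch (err) {
      break; // primary is still down; try again on the next run
    }
    backup.shift();
    moved++;
  }
  return moved;
}
```

The consumer only ever reads from the primary queue, which matches decision 4: it never needs to know that the backup exists.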

CakePHP requestAction

2009-02-08T22:01:00.000-08:00

We are currently going through another round of site optimization. This time, I used APD to profile our site. Since the home page is the most important page, I started by profiling it.

Total Elapsed Time = 1.74
Total System Time = 0.51
Total User Time = 1.18

Real User System secs/ cumm
%Time (excl/cumm) (excl/cumm) (excl/cumm) Calls call s/call Memory Usage Name
--------------------------------------------------------------------------------------
77.8 0.00 1.35 0.00 0.89 0.00 0.39 21 0.0000 0.0645 0 View->_render
77.8 0.00 1.35 0.00 0.89 0.00 0.39 23 0.0001 0.0589 0 include
69.1 0.00 1.20 0.00 0.83 0.00 0.36 1 0.0001 1.2032 0 main
64.0 1.11 1.11 0.76 0.76 0.33 0.33 1 1.1136 1.1136 0 apd_set_pprof_trace
48.1 0.00 0.84 0.00 0.55 0.00 0.24 20 0.0001 0.0419 0 View->renderElement
30.4 0.00 0.53 0.00 0.35 0.00 0.15 1 0.0001 0.5289 0 View->renderLayout
27.7 0.00 0.48 0.00 0.31 0.00 0.14 2 0.0001 0.2410 0 View->requestAction
27.7 0.00 0.48 0.00 0.31 0.00 0.14 2 0.0001 0.2407 0 Dispatcher->dispatch
9.4 0.00 0.16 0.00 0.11 0.00 0.05 1 0.0001 0.1636 0 MessagesController->constructClasses
8.7 0.00 0.15 0.00 0.09 0.00 0.06 1 0.0001 0.1513 0 BuddiesController->constructClasses
7.0 0.00 0.12 0.00 0.10 0.00 0.02 6 0.0000 0.0202 0 call_user_func_array
6.8 0.02 0.12 0.01 0.08 0.01 0.04 508 0.0000 0.0002 0 Set->extract
6.2 0.00 0.11 0.00 0.07 0.00 0.01 2 0.0000 0.0543 0 Dispatcher->_invoke
5.9 0.05 0.10 0.03 0.06 0.04 0.04 2242 0.0000 0.0000 0 low
5.9 0.00 0.10 0.00 0.06 0.00 0.04 38 0.0000 0.0027 0 array_map
5.2 0.09 0.09 0.06 0.06 0.02 0.02 1305 0.0001 0.0001 0 handleError
5.0 0.00 0.09 0.00 0.04 0.00 0.00 2 0.0000 0.0433 0 MessagesController->beforeFilter
4.3 0.00 0.08 0.00 0.04 0.00 0.01 2 0.0000 0.0377 0 Dispatcher->start
4.3 0.00 0.07 0.00 0.05 0.00 0.02 125 0.0000 0.0006 0 m
3.5 0.03 0.06 0.00 0.03 0.00 0.00 4 0.0068 0.0154 0 MochaLoggerComponent->debug

The View lines (View->_render, View->renderElement, View->renderLayout, View->requestAction) show that rendering takes a long time, and the time is mostly spent in requestAction. As I recall, CakePHP's requestAction is really slow. It just so happens that we have two requestAction calls in our layout, which is invoked on every single request.

Instead of using requestAction to retrieve data in the layout, I added a DB call in AppController and passed the data to the layout. The result is stunning.

Total Elapsed Time = 1.29
Total System Time = 0.38
Total User Time = 0.89

Real User System secs/ cumm
%Time (excl/cumm) (excl/cumm) (excl/cumm) Calls call s/call Memory Usage Name
--------------------------------------------------------------------------------------
95.6 0.00 1.23 0.00 0.85 0.00 0.36 1 0.0001 1.2346 0 main
88.6 1.14 1.14 0.79 0.79 0.33 0.33 1 1.1441 1.1441 0 apd_set_pprof_trace
11.4 0.00 0.15 0.00 0.10 0.00 0.04 21 0.0000 0.0070 0 View->_render
10.6 0.00 0.14 0.00 0.09 0.00 0.04 21 0.0001 0.0065 0 include
8.5 0.00 0.11 0.00 0.08 0.00 0.03 20 0.0001 0.0055 0 View->renderElement
5.9 0.00 0.08 0.00 0.05 0.00 0.03 123 0.0000 0.0006 0 m
4.8 0.00 0.06 0.00 0.04 0.01 0.02 123 0.0000 0.0005 0 __
3.7 0.00 0.05 0.00 0.03 0.00 0.02 1 0.0001 0.0484 0 View->renderLayout
2.8 0.00 0.04 0.00 0.02 0.00 0.01 123 0.0000 0.0003 0 I18n::translate
2.3 0.03 0.03 0.02 0.02 0.01 0.01 403 0.0001 0.0001 0 handleError
1.1 0.00 0.01 0.00 0.01 0.00 0.01 27 0.0000 0.0005 0 imgsrc
0.7 0.00 0.01 0.00 0.01 0.00 0.01 27 0.0000 0.0004 0 getUrlHashByte
0.7 0.00 0.01 0.00 0.00 0.00 0.00 2 0.0000 0.0043 0 FormHelper->select
0.6 0.01 0.01 0.01 0.01 0.00 0.00 582 0.0000 0.0000 0 ord
0.6 0.00 0.01 0.00 0.01 0.00 0.00 6 0.0002 0.0013 0 require_once
0.6 0.01 0.01 0.00 0.00 0.00 0.00 123 0.0001 0.0001 0 debug_backtrace
0.6 0.00 0.01 0.00 0.01 0.00 0.00 14 0.0000 0.0005 0 Version::link
0.5 0.00 0.01 0.00 0.00 0.00 0.00 13 0.0000 0.0005 0 VersionHelper->link
0.5 0.00 0.01 0.00 0.01 0.00 0.00 1 0.0000 0.0071 0 LanguageListHelper->nameFromCode
0.5 0.01 0.01 0.01 0.01 0.00 0.00 235 0.0000 0.0000 0 is_array

The first time around, it took 1.74s; excluding the 1.11s spent in the profiling call, the request took 0.63s.

The second time, it took 1.29s; excluding the 1.14s spent in the profiling call, the request took 0.15s.

(0.63-0.15)/0.63 = 76%.

Of course, this calculation is not all that scientific. I would have to run a controlled stress test over a long period of time to get an accurate result. However, based on the simple difference in the renderLayout call, I think it's safe to say removing requestAction will pay off.

Ubuntu LAMP

2008-10-11T17:50:00.000-07:00

1. Install packages
$ sudo tasksel install lamp-server
$ sudo apt-get install php5-curl
$ sudo pear install crypt_hmac

2. Subversion
$ sudo apt-get install subversion

3. Create a svn project
$ sudo mkdir /home/svn
$ cd /home/svn
$ sudo mkdir curvebreaker
$ sudo svnadmin create /home/svn/curvebreaker
$ sudo chown -R www-data curvebreaker
$ sudo chgrp -R subversion curvebreaker
$ sudo chmod -R g+rws curvebreaker

4. Enable http access for subversion
$ sudo vi /etc/apache2/mods-available/dav_svn.conf
Add following lines:
<Location /svn>
DAV svn
SVNParentPath /home/svn/
AuthType Basic
AuthName "Subversion Repository"
AuthUserFile /etc/apache2/dav_svn.passwd
<LimitExcept GET PROPFIND OPTIONS REPORT>
Require valid-user
</LimitExcept>
</Location>
$ sudo htpasswd -c /etc/apache2/dav_svn.passwd xiaoj
$ sudo apache2ctl restart

5. Create trunk in subversion
$ cd ~
$ mkdir -p workplace/trunk
$ cd workplace
$ svn co http://localhost/svn/curvebreaker trunk
$ svn add trunk
$ svn commit -m "create trunk" trunk

Install ImageMagick & RMagick on Ubuntu 8.04

2008-10-08T19:39:00.000-07:00

#1. Get ImageMagick
$ sudo aptitude update
$ sudo aptitude install libmagick9-dev

#2. Get RMagick
$ sudo gem install rmagick

#3. Test
$ irb -rubygems -r RMagick
irb(main):001:0> puts Magick::Long_version
This is RMagick 2.7.0 ($Date: 2008/09/28 00:23:10 $) Copyright (C) 2008 by Timothy P. Hunter
Built with ImageMagick 6.3.7 02/19/08 Q16 http://www.imagemagick.org
Built for ruby 1.8.6
Web page: http://rmagick.rubyforge.org
Email: rmagick@rubyforge.org
=> nil

Generic retry target in ANT

2008-03-30T22:34:00.000-07:00

I think the latest ANT trunk code has a new retry target. I wrote my own retry target using ANT 1.7 and ANT-contrib. It's kinda hackish, but it works well.
<target name="retry-task">
  <var name="task.return" value="false" />
  <for list="0,1,2" param="retry">
    <sequential>
      <if>
        <equals arg1="${task.return}" arg2="false" />
        <then>
          <trycatch>
            <try>
              <var name="task.return" value="true" />
              <antcall target="${retry-target}"/>
            </try>
            <catch>
              <var name="task.return" value="false" />
            </catch>
          </trycatch>
          <echo message="return ${task.return}" />
        </then>
        <else>
        </else>
      </if>
    </sequential>
  </for>
  <if>
    <equals arg1="${task.return}" arg2="false" />
    <then>
      <fail message="${retry-target} failed after 3 retries" />
    </then>
  </if>
</target>

To use this target:
<antcall target="retry-task">
  <param name="retry-target" value="some-other-ant-task" />
</antcall>

test

2008-02-29T12:00:00.000-08:00

test

MochaLogger Alpha Release

2007-11-04T19:25:00.000-08:00

PHP is an interesting language. It's designed specifically for writing web sites. Furthermore, it's designed to do mostly presentation layer work. CakePHP follows Ruby on Rails' framework. It has a somewhat well defined MVC framework, good scaffolding functions and alright performance. PHP as a presentation layer language allows the system logs to be sent to client's browser. A developer can use browser to debug server side problem quickly. CakePHP also provides this function.

Displaying system logs on the browser is very useful during dev testing. However, in real production environment, nobody wants to expose any system logs to the end users. CakePHP's file logging system is poorly implemented. It only offers two levels of logs: DEBUG or ERROR. Furthermore, debug logs and error logs are saved to two different files. This makes troubleshooting system problem by reading logs very difficult.

I'm a long-time Java user. Log4J is a great logging system. It allows users to write to the same or multiple log files with DEBUG, INFO, WARN, ERROR or FATAL level log messages. With Log4J, a developer can add DEBUG logs as detailed as needed to his code without worrying about printing out too much information on the production server. He can simply set the production server's log level to WARN and above. A developer can even configure a different log level for different Java classes.

In order to log more meaningful information in our system, I implemented MochaLogger. It's a plugin for CakePHP. It offers basic logging functions with 5 different log levels. It has a UI that allows users to configure logging levels for each PHP file/object at run time. MochaLogger is open sourced on CakeForge: http://cakeforge.org/frs/?group_id=215

2000+ new user registrations in day 1

2007-09-25T09:21:00.000-07:00

Livemocha survived day 1 with 2000+ new user registrations. We closely monitored the CPU, memory, NIC usages on all our servers. It's time to add more machines.
Come to visit us at: http://www.livemocha.com

CakePHP Performance Tuning

2007-09-13T21:43:00.000-07:00

Using JMeter, I created a load test with 5 simultaneous users hitting our login, home and logout pages. Each user had a delay of 1 to 2 seconds between steps during the load test. This yielded a throughput of 3.5 requests/second on the server.
Before the basic performance tuning, our server response time for the home page was about 2,600ms per request. We did the following tuning and reduced the page response time to 9ms.
1. Enable APC -- Alternative PHP Cache caches opcode and reduces the interpreter workload at runtime. This reduced the response time by 10%.
2. Enable CakePHP cache -- CakePHP can cache a controller's result, so when a controller is requested, the view directly renders the result instead of going to the database. We simply enabled the view cache without making any changes to the code. This actually eliminated all the table describe calls. Again, this step yielded a 10% performance gain.
3. html->link function -- This is the most interesting part of this performance tuning. K.S. wrote some awesome code to profile CakePHP. He found out that each html->link call takes 50 to 100ms to return. When we render many links on a page, the time adds up. Instead of using the fancy CakePHP link function, we just write plain HTML for links. This gave us almost a 50% performance gain.

Flickr

2007-07-24T13:36:00.001-07:00

This is a test post from Flickr, a fancy photo sharing thing.

Use Ant to deploy to remote hosts

2007-07-20T08:18:00.000-07:00

Ant is an awesome compile/package/deployment tool for Java code, but it's not limited to Java programs. Currently, I'm working on a PHP site, and we use Ant to deploy our app. I don't really want to list the details here, but I can explain the basic steps.
1. Staging
During this step, you use Ant's filter feature to customize all the configuration and resources. For example, you might have a different database for each environment: the developers' web server talks to the Dev database, QA's web server talks to the QA database, and prod's web server talks to the Prod database. You use Ant's copy and filter features to stage all the files from the source location to a target location.
2. Packaging
Simply use Ant's tar and zip features to package everything into a compressed file. You might want to add a timestamp or version number to the tar file name so you know which version you are deploying.
3. Deployment
To transfer the application package to a remote host, use Ant's scp task. Since we have multiple web servers, we also use the for task (from Ant-Contrib), which tokenizes a list of host names and loops over them so that the embedded scp task copies the package to each remote host. On the remote hosts, you should have a directory for holding all the app packages. You can use sshexec to create any directories remotely.
4. Installation
You can easily write a shell script and deploy it along with the application package, or use Ant's sshexec to unpack the application. However, I highly recommend a separate shell script, because then you don't have to rely on Ant to install your software, which means you can install, roll back, or roll forward your production server from anywhere.
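The staging and packaging steps above come down to token substitution and version-stamped file names. Here is a minimal Java sketch of that idea, not the actual build file: Ant's filters replace @token@ placeholders by default, and the class name StagingFilterSketch, the token names, the host names, and the package naming scheme are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of what Ant's copy + filter staging does to a config file. */
class StagingFilterSketch {

    /** Replaces each @key@ token with its value; tokens without a value are left untouched. */
    static String applyFilters(String template, Map<String, String> tokens) {
        String result = template;
        for (Map.Entry<String, String> entry : tokens.entrySet()) {
            result = result.replace("@" + entry.getKey() + "@", entry.getValue());
        }
        return result;
    }

    /** Builds a version-stamped package name so the deployed version is obvious. */
    static String packageName(String appName, String version) {
        return appName + "-" + version + ".tar.gz";
    }

    public static void main(String[] args) {
        // One template in source control, one token set per environment.
        String template = "db.host=@DB_HOST@\ndb.name=@DB_NAME@\n";

        Map<String, String> dev = new LinkedHashMap<>();
        dev.put("DB_HOST", "dev-db.example.internal");   // assumed host name
        dev.put("DB_NAME", "app_dev");                   // assumed database name

        Map<String, String> prod = new LinkedHashMap<>();
        prod.put("DB_HOST", "prod-db.example.internal"); // assumed host name
        prod.put("DB_NAME", "app_prod");                 // assumed database name

        System.out.print(applyFilters(template, dev));
        System.out.print(applyFilters(template, prod));
        System.out.println(packageName("webapp", "20070720-0818"));
    }
}
```

Keeping a single template with per-environment token sets means the Dev, QA, and Prod packages differ only in the values substituted during staging.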

Ant JDBC and UTF8

2007-07-11T18:13:00.000-07:00

You really have to watch out when you want to use UTF8 for your database: enabling unicode support in the database, creating tables with the correct encoding settings for each column, and so on. There is one more catch I just realized as I was trying to use Ant JDBC to load some Chinese characters into our database.
You need to specify UTF8 and unicode in the JDBC connection string.
<sql driver="com.mysql.jdbc.Driver" url="jdbc:mysql://${db.host}?useUnicode=true&amp;characterEncoding=UTF-8" userid="${db.user}" password="${db.password}">
  <transaction src="${target.dir}/database/script.mysql"/>
</sql>
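The same two parameters apply when the connection string is built in Java code rather than in a build file. The sketch below only assembles the strings and does not open a connection; the class name Utf8JdbcUrlSketch, its methods, and the host/database value are hypothetical. It also shows that a literal & must be written as &amp; when the URL is placed inside an XML attribute.

```java
/** Hypothetical helper that assembles a MySQL JDBC URL with the UTF-8 parameters from the post. */
class Utf8JdbcUrlSketch {

    /** Appends useUnicode and characterEncoding to the host (and optional database) part. */
    static String buildUrl(String hostAndDatabase) {
        return "jdbc:mysql://" + hostAndDatabase
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    /** Escapes ampersands so the URL can be pasted into an XML attribute. */
    static String escapeForXmlAttribute(String url) {
        return url.replace("&", "&amp;");
    }

    public static void main(String[] args) {
        // "localhost:3306/appdb" is an assumed value standing in for ${db.host}.
        String url = buildUrl("localhost:3306/appdb");
        System.out.println(url);                         // form used in Java code
        System.out.println(escapeForXmlAttribute(url));  // form used in build.xml
        // With the MySQL driver on the classpath, the first form would be passed to
        // java.sql.DriverManager.getConnection(url, user, password).
    }
}
```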
