Monday, May 4, 2009

SimpleDB Batch write

About 3 weeks ago, we started using SimpleDB to store user activity feeds.  On average, we generate between 10 and 15 feeds per second.  To make the feed generation process asynchronous, we send messages to SQS from our datacenter.  Three EC2 instances then process the SQS queue.
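The shape of one worker process is roughly the following sketch (Python here as a stand-in for our PHP workers; `receive_message`, `put_record`, and `delete_message` are hypothetical placeholders, not any specific client library's API):

```python
# Hypothetical shape of one worker: poll SQS, write the feed to SimpleDB,
# then delete the message. Function names are placeholders for illustration.

def process_queue(receive_message, put_record, delete_message):
    """Drain the available messages; return how many feeds were stored."""
    handled = 0
    while True:
        msg = receive_message()          # one SQS ReceiveMessage call
        if msg is None:                  # queue is empty
            break
        put_record(msg["body"])          # one SimpleDB write per feed
        delete_message(msg["receipt"])   # ack only after a successful write
        handled += 1
    return handled
```

Deleting the SQS message only after the SimpleDB write succeeds means a crashed worker just leaves the message to be redelivered later.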

When the system was designed and deployed, SimpleDB had just started offering the "batch write" operation.  However, the PHP client did not support it at the time, so we had to send one message at a time.  The write performance of SimpleDB is pretty bad this way: we were only able to get 1 or 2 writes per second.  In order to keep up with the feed generation speed, we run 5 concurrent processes on each EC2 instance, triggered by crontab once every minute.  Each process writes up to 100 records in 1 to 2 minutes, so the average write time is about 1 second per record.
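A quick back-of-the-envelope check on those numbers shows how little headroom this setup had (Python just for the arithmetic; all constants come from the figures above):

```python
# Sanity check: aggregate write capacity vs. peak feed generation rate.
# All numbers come from the post; nothing here is measured code.

FEEDS_PER_SECOND = 15              # peak feed generation rate
INSTANCES = 3                      # EC2 instances processing SQS
PROCESSES_PER_INSTANCE = 5         # concurrent cron-launched processes
WRITES_PER_SEC_PER_PROCESS = 1.0   # observed single-write SimpleDB rate

capacity = INSTANCES * PROCESSES_PER_INSTANCE * WRITES_PER_SEC_PER_PROCESS
print(f"capacity: {capacity:.0f} writes/s, peak load: {FEEDS_PER_SECOND} feeds/s")
# Capacity only just matches the peak load, so any slowdown backs up the queue.
```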

Last Friday, our monitoring tool showed that our SQS queue was backed up.  When I checked our EC2 instances, the load on each one was way too high.  The writes were not performing as fast as we had planned, so all the processes queued up.

Today, I switched to the "batch write" operation.  The performance is really good.  Each individual operation still takes about 1 second, but each operation can write up to 25 records.  Now a PHP process can generate up to 1000 records in about 30 seconds, for an average write time of roughly 0.03 seconds per record.
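The batching logic itself is simple; here is a minimal sketch (again Python standing in for our PHP, with `send_batch` as a hypothetical stand-in for the real client's batch-write call; the 25-record limit is SimpleDB's BatchPutAttributes cap):

```python
# Sketch of the batching logic: split records into SimpleDB-sized batches
# and make one request per batch. send_batch is a hypothetical callback.

def chunks(records, size=25):
    """Yield successive batches of at most `size` records."""
    for i in range(0, len(records), size):
        yield records[i:i + size]

def write_all(records, send_batch):
    """Write all records in batches of 25; return the number of requests."""
    calls = 0
    for batch in chunks(records, 25):
        send_batch(batch)   # one BatchPutAttributes request per 25 records
        calls += 1
    return calls

# 1000 records -> 40 batch requests instead of 1000 single writes.
```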


Wow Panda said...

From all your blogs, I somehow got the impression that you guys are using bad hardware...

CodeSith said...

Well, it's Amazon's hardware that we are using.... Maybe we are just very unlucky and always receive the few bad pieces of hardware that Amazon has to offer. Life in "The Cloud".