{"id":1428,"date":"2007-03-08T15:40:27","date_gmt":"2007-03-08T08:40:27","guid":{"rendered":"http:\/\/harry.sufehmi.com\/archives\/2007-03-08-1428\/"},"modified":"2009-01-17T16:25:58","modified_gmt":"2009-01-17T09:25:58","slug":"high-load-website-optimization-ilmukomputercom","status":"publish","type":"post","link":"https:\/\/harry.sufehmi.com\/archives\/2007-03-08-high-load-website-optimization-ilmukomputercom\/","title":{"rendered":"High-load Website (WordPress) Optimization : IlmuKomputer.com"},"content":{"rendered":"

Mr. Romi, founder of IlmuKomputer.com<\/a> (IKC), asked me yesterday to help optimize the website. A bit about IlmuKomputer.com: the name means “Computer Knowledge”, and the site contains a lot (and I mean a lot) of free, high-quality computer tutorials.
\nAs you can easily guess, the website is very popular. At peak hours it usually becomes overloaded and unresponsive.<\/p>\n

I’m only too happy to assist the IKC team in their good cause. So I started working on it with help from one of my staff, Yopi.<\/p>\n

It turned out that what we ended up doing was very different from what most others do. IKC is a very popular website (and “slashdotted” daily by leechers), so what works for most others doesn’t work for us.<\/p>\n

The Bottlenecks<\/strong><\/p>\n

A bit of background: IKC uses WordPress as its CMS. It’s a very nice CMS, and makes your life easier. I’ve used WP myself since version 1.5.x. However, being database-driven, a WP-based infrastructure has many points that can become a bottleneck. So if your website starts to become popular on this CMS, you will need to start optimizing it.<\/p>\n

After examining the situation for a while, it was clear that MySQL was THE bottleneck. The output of top showed it using at least 8 times as much CPU time as any other service. Mr. Romi also told me that it kept going down at peak times.<\/p>\n

Apache (and PHP, since it’s compiled as an Apache module) is the next one, with each of its processes using more than 10 MB of RAM. This may seem insignificant at first, but multiply that by (potentially) 150 processes and you’ve got quite a memory hog.
\nAs for CPU usage, I was quite surprised to see that each incoming request caused the handling process’s CPU usage to spike above 50%.<\/p>\n
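
The memory figure above can be made concrete with a quick calculation. This is only a rough sketch using the two numbers quoted in the text (about 10 MB per process, up to 150 processes); the function name is made up for illustration, and shared memory between processes is ignored, so the real figure may be lower.

```python
# Rough worst-case memory estimate for the Apache processes.
# Both figures come from the text; actual usage will vary, and
# memory shared between processes is not accounted for here.
MB_PER_PROCESS = 10
MAX_PROCESSES = 150


def apache_memory_mb(mb_per_process, processes):
    # Hypothetical helper: total RAM if every process uses the same amount.
    return mb_per_process * processes


total_mb = apache_memory_mb(MB_PER_PROCESS, MAX_PROCESSES)
print(total_mb, 'MB')
```

On a single server that also has to run MySQL, that is a large share of the available RAM.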

Initial actions<\/strong><\/p>\n

I asked Mr. Romi to increase the size of MySQL’s internal cache. He did, but the machine still went down on a daily basis.<\/p>\n

He had also implemented caching on the app server (PHP) by way of the wp-cache plugin. Still no joy.<\/p>\n

The Edge<\/strong><\/p>\n

I decided that we needed to go straight to the “edge”, and stop the load there.<\/p>\n

I proposed that I set up Squid in HTTP acceleration mode. This way, most of the requests won’t even touch Apache, much less MySQL. Squid will bear most of the load, but since it’s very efficient, it should help a lot in making the website perform better.<\/p>\n

Since I had a few things to do myself, I asked Yopi to set up Squid on our test machine.
\nI just gave him pointers now and then, yet he managed to finish testing the setup and implement it on IKC’s server in just about 3.5 hours.<\/p>\n

Then I showed him “tail -f \/log\/squid\/access.log”, and we watched in amazement at how quickly the TCP_MISS lines changed to TCP_HITs.
\nAfter about 12 hours, I increased the cache_mem size, and the TCP_HITs slowly started changing to TCP_MEM_HITs.<\/p>\n
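
Watching the log scroll by gives a feel for the cache, but the same information can be counted. The sketch below assumes Squid's native access.log layout, where the fourth whitespace-separated field looks like TCP_HIT/200. The function names and the sample lines are made up for illustration, and treating every code containing HIT as a cache hit is a simplification.

```python
from collections import Counter


def count_result_codes(lines):
    # Tally the Squid result code (TCP_HIT, TCP_MISS, ...) of each log line.
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        code = fields[3].split('/')[0]  # 'TCP_HIT/200' -> 'TCP_HIT'
        counts[code] += 1
    return counts


def hit_ratio(counts):
    # Fraction of requests whose result code contains 'HIT'.
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for code, n in counts.items() if 'HIT' in code)
    return hits / total


# Invented sample lines, not taken from IKC's real log.
sample = [
    '1173343227.101 45 10.0.0.1 TCP_MISS/200 5120 GET http://example.com/ - DIRECT/127.0.0.1 text/html',
    '1173343228.202 2 10.0.0.2 TCP_HIT/200 5120 GET http://example.com/ - NONE/- text/html',
    '1173343229.303 1 10.0.0.3 TCP_HIT/200 5120 GET http://example.com/ - NONE/- text/html',
    '1173343230.404 0 10.0.0.4 TCP_MEM_HIT/200 5120 GET http://example.com/ - NONE/- text/html',
]

counts = count_result_codes(sample)
print(dict(counts))
print(hit_ratio(counts))
```

To run it against a real log, pass an open file object instead of the sample list, for example count_result_codes(open(path)) with path pointing at the access.log.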

The result<\/strong><\/p>\n

Squid is working as we expected.<\/p>\n

Average server load dropped from over 30% to about 3%, while squid’s CPU usage rose from 0% to an average of only 2%. A very nice trade-off.<\/p>\n

After about a month, I checked the website’s logfiles, and saw some very nice numbers: traffic to IlmuKomputer.com has doubled<\/strong>! Needless to say, Mr. Romi is very happy with it.<\/p>\n

I also found that every day there are people downloading the contents using crawler software such as Teleport Pro, wget, etc. I asked Mr. Romi if he had a problem with it, and he said no. It is his mission to spread knowledge for free after all. So I left these leechers alone.<\/p>\n

Come to think of it, it’s possible that these crawlers were the ones causing the IKC server to go down at peak hours. For example, Teleport Pro is able to download 10 links simultaneously. Once any of them finishes, it instantly starts downloading the next one. When all 10 downloads hit the database, and many crawlers are doing this at the same time, not many servers will be able to stand up to it. It’s like being machine-gunned while wearing just a simple leather coat. If you have had the experience of having your website linked from Slashdot or Digg, you’ll understand what I’m talking about.<\/p>\n
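
To put a rough number on this: a back-of-the-envelope sketch, assuming each crawler connection ties up one Apache process for as long as it lasts, and reusing the 150-process ceiling mentioned earlier as the limit. The function name is made up for illustration, and regular visitors are ignored.

```python
# How many crawlers, each holding 10 simultaneous connections, would it
# take to occupy every Apache process? Figures are taken from the text;
# the one-connection-per-process model is an assumption.
APACHE_MAX_PROCESSES = 150
CONNECTIONS_PER_CRAWLER = 10


def crawlers_to_saturate(max_processes, connections_per_crawler):
    # Ceiling division: a partly used crawler still counts as one crawler.
    return -(-max_processes // connections_per_crawler)


print(crawlers_to_saturate(APACHE_MAX_PROCESSES, CONNECTIONS_PER_CRAWLER))
```

Under these assumptions it takes only a handful of simultaneous crawlers, not hundreds, to leave no free process for ordinary readers.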

In this case, squid acts as thick titanium armor, taking most of the hits for your server. I suspect the number of crawlers has increased since then, but it shouldn’t be a problem.<\/p>\n

MySQL is a bit strange though. Sometimes its CPU usage can be as high as 160%. Thankfully this is very rare, so it’s probably just some internal clean-up routine. <\/p>\n

One day, after happily watching the low load on the server for a while, suddenly everything froze. Even my SSH connection. Attempts to reconnect to the server failed.
\nAfter a while, I was finally able to connect again. Looking around, I noticed there’s some sort of bandwidth limiter daemon running on the server. After consulting with Mr. Romi, I killed it. The problem stopped.<\/p>\n

Happy ending ?<\/strong><\/p>\n

I’m still monitoring the server for glitches as we speak. For example, squid seems to hang from time to time. This can be caused by anything from bad memory to a problem with the specific hardware configuration, so for now I’ve set up a cron job which restarts it at set intervals.
\nIt seems to help, so I can troubleshoot the problem in peace.<\/p>\n

Anyway, I’m sure that with the increased availability, even more people will visit the website (Ed: confirmed!). Then at some time in the future, we may find the server overloaded again.<\/p>\n

In that case, there are still many things which we can do to keep IKC up & running on just one server:<\/p>\n