Several days ago I had problems with my web infrastructure. It seemed to be overloaded. Traffic-wise, it should not have been, because (1) I have a caching reverse proxy (squid) installed, and (2) 3 terabytes of traffic is still well within its current capacity.
Of course, other factors can change this equation, for example database-intensive pages. In that case, even several requests per minute may be enough to overload your servers.
Anyway, I notified the affected parties and started troubleshooting. Following the usual process of benchmarking, profiling, and optimization (BPO), I soon had all fingers pointing at squid. So I tried several alternatives: varnish, nginx, ncache. All failed, but that is for another post. This post is about hype, and how even IT experts fall for it.
While doing the BPO process, I got chatting with several friends who are quite well known as IT experts. Help is always welcome, so I followed the discussions. The suggestions were rather strange, but all was still well. Until one of them suggested that I move my infrastructure to a CDN (Content Delivery Network).
I almost snorted coffee through my nose 😀
(I really should take it by IV drip to prevent this from happening again in the future, but anyway…)
A bit about CDNs: a CDN is basically a network of servers all over the world, all hosting the same set of data. When a visitor requests a file, it is served from the server closest to the visitor’s location, so the visitor can fetch the data at maximum speed.
That’s basically how a CDN works. There are variations, but this is the basic idea.
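To make the "closest server" idea concrete, here is a minimal sketch in Python. The edge locations and the function names are made up for illustration, and it picks the server by plain geographic distance. Real CDNs usually decide through DNS or anycast routing and measured network latency, so treat this as a simplification of the idea, not as how any particular CDN does it.

```python
import math

# Hypothetical edge servers: name -> (latitude, longitude). Illustrative only.
EDGE_SERVERS = {
    "singapore": (1.35, 103.82),
    "frankfurt": (50.11, 8.68),
    "dallas": (32.78, -96.80),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_edge(visitor, servers=EDGE_SERVERS):
    """Return the name of the edge server geographically closest to the visitor."""
    return min(servers, key=lambda name: haversine_km(visitor, servers[name]))


if __name__ == "__main__":
    jakarta = (-6.2, 106.8)
    print(nearest_edge(jakarta))
```

Note that every server in the table is assumed to hold the same file already. That assumption is exactly what becomes hard once the content changes all the time.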
The problems with using a CDN:
(1) CDNs are for static content: Facebook users have probably seen their browser’s status bar showing lines such as “Loading static.ak.fbcdn.net”. That’s Facebook’s own CDN. Notice the first word of the domain name? Yup, static.
There’s a reason why CDNs are for static content. Static content is easier to synchronize and deliver across the whole network. You can, indeed, sync and deliver dynamic content through a CDN, but the complexity instantly jumps by several orders of magnitude. And so does the cost. Which brings us to the second reason:
(2) Cost: a standard CDN will cost you at least 5x your normal bandwidth costs.
SoftLayer.com made a breakthrough here: their CDN costs “only” twice the normal bandwidth price.
However, that’s still 200% of the normal cost (100% more expensive), and my web infrastructure hosts dynamic content, which may change by the minute, so it’s absolutely out of the question.
If that friend is willing to foot the cost, then I’m willing to play with the CDN. It makes things more fun with none of the pain 🙂
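For the record, here is the cost arithmetic as a tiny Python sketch. The baseline bill of 100 is a made-up number and the function names are mine; only the 5x and 2x multipliers come from the figures above. It also shows that "twice the price" means 200% of the normal bill, which is 100% more.

```python
def cdn_cost(bandwidth_cost, multiplier):
    """Cost of serving the same traffic through a CDN priced at `multiplier` times bandwidth."""
    return bandwidth_cost * multiplier


def percent_more(multiplier):
    """How much more expensive than plain bandwidth, in percent."""
    return (multiplier - 1) * 100


if __name__ == "__main__":
    baseline = 100.0  # hypothetical monthly bandwidth bill, any currency
    for label, multiplier in [("standard CDN", 5), ("SoftLayer CDN", 2)]:
        print(label, cdn_cost(baseline, multiplier), f"{percent_more(multiplier)}% more")
```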
Anyway, I’m still amazed at how even IT experts fall for hype. I know a CDN sounds cool & hip & sophisticated and so on; still, personally I prefer hard proof, especially proof I have verified myself.
But to each their own, I guess. Just try not to mislead others by spreading the hype too, okay?
Repeat after me: a CDN is NOT a silver bullet. And as we all know already, applying the wrong solution to a problem will just cause even more problems.
Regarding my problem, I solved it by moving squid’s cache to a different disk. It looks like the previous disk was defective. With some further tweaks, performance has now almost doubled compared to before the trouble began. Some of the websites fully load in as little as 2 seconds. Not bad.
Performance-wise, it’s now alright. But my work to further expand the capacity of my web infrastructure still continues. For now, the customers are happy.
That’s what matters.