Shard-ing made easy : Scale-out your web-apps to infinity with no code rewrite

I’ve always believed that the best solution is the simplest one. Simple solutions tend to be more foolproof, more reliable, and easier to maintain in the long run.
That said, I had a hard time believing my own eyes when I found what might be the holy grail of High Availability (HA) and High Performance Computing (HPC) : easy & transparent sharding (no application recoding), with HSCALE.

A bit about sharding : also known as a “shared-nothing architecture”, it splits the data across independent servers so that each one handles only its own slice, which allows almost unlimited scalability. Google, Facebook, Flickr, MySpace – name any high-traffic website, and there’s a very good chance that they have already implemented it in one way or another. Whenever they need more performance, they throw another box into the system, and that’s it. Instant performance gain.
However, there’s a significant drawback – to most, sharding means a total rewrite of your application.
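
To make the idea concrete, here is a minimal sketch of application-level sharding in Python. The server names and the `shard_for` / `route` helpers are hypothetical, made up for illustration; the point is only that every piece of code touching the data has to know which server owns which row, and that is why sharding usually means a rewrite.

```python
import hashlib

# Hypothetical shard list; in a real deployment each entry would be a separate MySQL server.
SHARDS = ["db0.example.internal", "db1.example.internal", "db2.example.internal"]


def shard_for(user_id, shards=SHARDS):
    """Pick the shard that owns a given user_id using a stable hash."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def route(user_ids, shards=SHARDS):
    """Group user ids by the shard that owns them."""
    placement = {name: [] for name in shards}
    for uid in user_ids:
        placement[shard_for(uid, shards)].append(uid)
    return placement


if __name__ == "__main__":
    for name, ids in route(range(1, 11)).items():
        print(name, ids)
```

Note that with a plain modulo like this, adding a server changes where most keys land, so real systems tend to use a lookup table or consistent hashing instead. Treat the sketch as an illustration of the routing step only.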

I’ve been providing consultation services to several high-traffic websites. They’re doing pretty well now, but some are about to go to the next level, where their current standard 4-tier architecture (edge server – web server – application server – database) no longer suffices and will need to scale up (bigger servers) or scale out (more servers).
Most, however, cannot rewrite their apps for sharding, for various reasons: the resources needed, development time, potential reliability issues, and so on.

Let’s back up a bit (this can be very helpful; sometimes we’re too busy charging forward as fast as we can, and miss all the hints we encounter along the way).
If we examine this infrastructure, almost all parts of it are already easily scalable. Edge servers: check, you can throw in another box for an instant performance gain. Web servers & application servers: check as well, especially with PHP, which already follows the shared-nothing architecture. We can even load-balance them in software and have it outperform and out-feature the hardware load balancers [1].
So that leaves the database as the bottleneck.

For the database, my requirements are simple (cough) :

  • Scale out, not up : There’s a limit to how big a server can be. Several high-traffic websites (YouTube, MySpace) chose to scale up, and they felt the pain when they hit the dead end – no server bigger than the one they already had. So it definitely has to be a scale-out (more boxes) solution.
  • Transparent : No application recode.
  • Easy to maintain : I want a system that doesn’t need constant babysitting. I (we all) have better things to do with my (our) time.
  • Works for MySQL : it’s the most used database for web-apps, currently.
  • Affordable : or, at least, offered under several pricing schemes, not just one. We all have different needs and budgets.

Yeah, I’m picky. And that caused me a serious headache – for weeks I was not able to find anything.

Several options that I looked into :

  • MySQL cluster : NDB = memory/RAM-only tables? Full backup/restore every time we add a new server? No thanks.
  • MySQL replication : I looked into single/multiple master – multiple slaves setups. I don’t like them: (1) I have systems which need scalability for write ops as well; (2) for read ops, caching (especially memcached) can provide a better performance gain with less headache; (3) it’s asynchronous – meaning the user edits his profile, clicks save, and lo, the old profile is still there (because the update hasn’t reached the slaves yet); (4) the multiple-master scheme seems safe only for up to 2 masters; I don’t fully understand the risks with more than 2 masters (not enough case studies).
  • MySQL partition : Good idea (although it doesn’t do vertical / column-level partitioning yet, but that’s OK [2]). There are limits, but they’re not a showstopper for most. However, it doesn’t scale out.
  • Various MySQL load-balancers, replication, HA solutions : Pick any / several of these : unproven, costs an arm and a leg, proprietary, needlessly complex, need loads of maintenance work, etc.
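
Point (3) in the replication item above, the stale read, is easy to reproduce with a toy model. The sketch below is plain Python with made-up `Master` and `Replica` classes; it does not talk to MySQL at all, and only mimics a replica that applies the master’s change log some time after the write has been acknowledged.

```python
class Master:
    """Toy master: applies writes immediately and records them in a change log."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Replica:
    """Toy replica: only sees a change after catch_up() has replayed it."""

    def __init__(self, master):
        self.master = master
        self.data = {}
        self.applied = 0  # number of change-log entries already applied

    def catch_up(self):
        for key, value in self.master.binlog[self.applied:]:
            self.data[key] = value
        self.applied = len(self.master.binlog)

    def read(self, key):
        return self.data.get(key)


master = Master()
replica = Replica(master)

master.write("profile:7", "old bio")
replica.catch_up()

master.write("profile:7", "new bio")  # the user clicks save
stale = replica.read("profile:7")     # the replica has not caught up yet
replica.catch_up()
fresh = replica.read("profile:7")     # now the update is visible

print("before catch-up:", stale)
print("after catch-up: ", fresh)
```

The window between the write and the catch-up is the replication lag. On a busy site it is long enough for a user to reload the page and see the old profile.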

So it was with great disbelief that I read the description of HSCALE by Michael Shadle : “Sharding pushed from application level, to transparent DB proxy”.

Here’s the beauty of its simplicity : It’s just a plugin for MySQL-Proxy 🙂
MySQL-Proxy is a high-performance proxying daemon for (guess what) MySQL, created by Jan Kneschke – also well known for Lighttpd, one of the fastest web servers on Earth. Then Peter Romianowski decided to build his sharding proxy upon this great foundation.
So we have HSCALE.

Amazingly, this solution for a high-performance database (HSCALE) is similar to the one for high-performance application servers (HAProxy) : proxying 🙂

Configuration is dead simple. It can spread queries across multiple backend servers (with some minor limitations, but these are being actively worked on), transparently to the application. Version 0.3 will enable us to run multiple instances of HSCALE (to avoid it becoming the bottleneck). And it’s free (do remember to contribute back if your website becomes successful because of it). MySQL-Proxy’s scripting capability provides an easy & clean way to extend it to fulfill your own requirements. And so on.

Currently not many people know about HSCALE’s existence, so I’m writing this article to let more people know about it.
HSCALE is already making great progress. I’m sure it’ll improve at a much faster rate if more people know that it exists. So here it is.
And kudos to Peter Romianowski & Jan Kneschke for their software. Here’s cheering for a great future for both MySQL-Proxy & HSCALE.

A small FAQ for now; no doubt the list will grow as I (and you) find out more about this :

  • Q: So, come again, how does HSCALE work?
    A: First you define the partitioning scheme in HSCALE’s config file. Then it’s all automatic from there: HSCALE partitions / regroups the rows – not your application.
  • Q: What’s the difference then with MySQL’s partition feature ?
    A: You can spread HSCALE’s partitions to multiple databases (meaning: multiple backend servers), while MySQL’s only works on a single database.
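
The two answers above can be sketched in a few lines of Python. To be clear, this is not HSCALE’s code or its real configuration format (HSCALE is a MySQL-Proxy plugin); the `SCHEME` dictionary, the `_pN` table suffix, the backend names and the `plan` function are all assumptions made up for illustration. It only shows the general trick of a proxy reading the partition key out of a query, rewriting the table name, and choosing a backend, while the application keeps sending its original SQL.

```python
import re

# Hypothetical partitioning scheme: table -> key column, partition count, backends.
# This is NOT HSCALE's real configuration format.
SCHEME = {
    "users": {
        "key": "user_id",
        "partitions": 4,
        "backends": ["backend-a", "backend-b"],
    },
}


def plan(sql, scheme=SCHEME):
    """Return (backend, rewritten_sql) for a query.

    Returns (None, sql) when the query touches no partitioned table
    or does not contain the partition key.
    """
    for table, cfg in scheme.items():
        table_re = rf"\b{re.escape(table)}\b"
        if not re.search(table_re, sql, re.IGNORECASE):
            continue
        key_re = rf"\b{re.escape(cfg['key'])}\s*=\s*(\d+)"
        match = re.search(key_re, sql, re.IGNORECASE)
        if match is None:
            # A real proxy would have to query every partition or reject the query.
            return None, sql
        part = int(match.group(1)) % cfg["partitions"]
        backend = cfg["backends"][part % len(cfg["backends"])]
        rewritten = re.sub(table_re, f"{table}_p{part}", sql, flags=re.IGNORECASE)
        return backend, rewritten
    return None, sql


if __name__ == "__main__":
    print(plan("SELECT * FROM users WHERE user_id = 42"))
    print(plan("SELECT 1"))
```

With this toy scheme, a query for `user_id = 42` lands in partition 2 (42 mod 4) and is sent to the first backend as a query against `users_p2`. Queries without the partition key are the hard case, which is presumably where the “minor limitations” come from.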

OK, my head still hurts from marathon reading of so much documentation on this topic (transparent sharding) ! Chances are great that there are errors in this article. If you do find any, please leave a comment so I can fix them. Thanks !

In the meantime, shall we play with HSCALE in the Sandbox? Have fun !

[1] HAProxy does indeed seem to be the leader of the pack (apps/web server high availability + load balancers) at the moment. It can load-balance 20 Gbps with the CPU 85% idle, has loads of great features, and is secure. Even swears by it (and so do several Forbes 500 companies).

[2] MySpace found out that vertical-partitioning does not scale.

88 thoughts on “Shard-ing made easy : Scale-out your web-apps to infinity with no code rewrite”

  1. Wonderful!
    It kinda reminds me of Oracle RAC. Without the painful configuration part and the deadly $25k/processor license part.

    Wonderful. Wonderful. Wonderful!

    I have a few questions, however:

    1. Concurrency. Is it good? How fast can it be? Have you tested it yet?
    2. Transactions. How does HSCALE handle multiple INSERTs on multiple tables in a single transaction? If it spreads the queries across separate partitions (which is the desired effect), will it leave the database in an unstable state?
    3. My love/hate relationship with MySQL: how about full-text search with InnoDB? Does HSCALE handle it well?

  2. @Andry – I haven’t had the chance yet, but we will definitely be testing this extensively. For the moment, you can look at some benchmarks by the author.
    Re: transaction : I believe it spreads the INSERT ops to multiple tables / databases, according to how we defined them in its config file. The config file looks similar to how MySQL partitions its tables (much simplified, though).
    Re: fulltext – no idea yet 🙂 I’d definitely be interested in this as well.
    For more detailed information, I highly suggest Peter’s presentation on HSCALE. Excellent stuff.
    Note: the presentation says that multiple-database capability is slated for version 2.0 – in fact, it’s already in version 0.2 (I guess he forgot to update his slides). Amazing.
    Anyway, HSCALE is indeed in its early stages (although it IS progressing very rapidly). What I like most about it is the concept. He nailed it the first time – transparent performance gain through proxying.
    I can’t believe that other experts missed it completely and have instead been beating around the bush with various inferior solutions (example: MySQL cluster is truly a joke – I didn’t even bother to test it once I read about NDB being a memory-only database).
    It’s going to be fun 🙂

  3. @Andry – “will it leave the database in an unstable state?”
    Sorry, missed this one – since each “partition” in HSCALE is a complete and proper MySQL table, it will not cause any instability / data corruption.

  4. after finding this blog, i got many things to know…. especially from this post, well written and very informative post ! thanks

  5. Wow, I haven’t studied about that yet..

    but i think it’s a bit like clustering right?

    thanks for the information.

  6. And I say get a mac for scalable powerful AND simple computing. And what about new frameworks like CakePHP? Is HSCALE compatible with this?

  7. Very informative post. It would surely help a lot of webmasters create masterpiece web applications. I am going to digg it in my favs.

  9. Really interesting. I have read a lot about this in articles written by other people, but I must admit that yours is the best.

  10. I have read your article, and it is interesting and very useful. I’d like to see more articles. Thanks for the knowledge.

  11. Hey Harry, keep up the blogging, you really have a great talent for it!:D

  12. Thanks a lot, Harry, for helping out and getting hands-on right away 🙂 Over the coming week we’ll first watch how things go. The server is fairly stable, and CPU and memory usage are under control. At least for now it’s good enough, until we think about another strategy. Once again, thanks.

  13. You should try living in India for a while, they have got the entire system down to art form so you end up begging the authorities to take a bribe just to do things like remove the rubbish from outside your apartment! Slow websites I can handle, authorities that make their money by taking bribes for doing what they should be doing anyway can test my patience to the limits.

  14. I used to be more than happy to seek out this internet-site.I wanted to thanks in your time for this glorious read!! I positively enjoying each little bit of it and I have you bookmarked to check out new stuff you weblog post.

  16. The post is absolutely great! Lots of great info and inspiration, both of which we all need! Also like to admire the time and effort you put into your website and detailed info you offer! I will bookmark your site!

  18. Great post! You have a very informative article here. Thank you for sharing this tips

  19. I really appreciate this wonderful post that you have provided for us. I assure this would be beneficial for most of the people. Thanks for sharing the information keep updating, looking forward to more post.
