I found this post on a talk about scalability at YouTube.
I knew that they were using Python, basic tools like ssh/rsync, MySQL, and leased servers. What I didn’t know was some of the detail that the speaker revealed in this talk. The interesting bits are about MySQL and Memcached and how they followed the same old patterns as every other startup.
Everybody thinks MySQL and Linux are ‘super frantastic’. When you really have scaling issues, you will really get to see how production ready these systems are. The only reason these technologies are popular is that they are free, ship in easy-to-install distros, and regardless of their flaws… satisfy the requirements for 99% of the sites out there. I’m no longer a technology purist; I see the value in good enough. There is nothing wrong with being free and good enough. We use these very same technologies every day. It just annoyed me when I heard previous founders and CTOs exclaim that Linux and MySQL technology were so much better than ABC. They always got burned when their traffic started moving up the Alexa ranks. When they tried to stop the burning, they couldn’t find many posts on the MySQL or Linux kernel lists to quell their issues. When you are scaling, you are now in a rarefied place. Keep that in mind as we move on.
It’s also a widely held belief by many non-technical founders that these scaling issues are solved. Hey, if ‘XYZ’ is so big and they are using this stuff, it must work (i.e., you don’t know what you are doing). Company XYZ is hiring lots and lots of really smart people (RSP) to solve these sticky issues. They are spending days looking at single issues (why are we getting connection resets every ### connections?). They are staring at ‘netstat -s’ output. They are coming up with workarounds and hacks that they may or may not understand. They are looking at kernel, glibc, PHP, or Ruby source. They may now be customers of some large companies that offer ‘support’ for their hardware. They have all written software internally that solves their problems (I wonder how much of that is linked with GPL software? Ah, no matter, they don’t ship CDs 🙂). This is now some of their technology or secret sauce… hacks. Some of the smarter of these companies are releasing this stuff into the open. Don’t hold your breath waiting for Google to release their gems (GFS, BigTable, etc.) to the open. They are smart; that really is their secret sauce.
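As a concrete (if trivial) example of the kind of staring involved, here is a rough sketch in Python of pulling the reset and retransmit counters out of ‘netstat -s’. The exact counter wording varies by kernel and netstat version, so the substrings below are assumptions to adjust for your platform:

```python
import subprocess

def tcp_trouble_counters():
    """Pull the reset/retransmit lines out of `netstat -s` output.

    NOTE: the substrings matched here are assumptions; the exact counter
    wording differs between kernels and netstat versions.
    """
    out = subprocess.run(["netstat", "-s"], capture_output=True, text=True).stdout
    interesting = ("connection resets", "resets sent", "retransmit")
    return [line.strip() for line in out.splitlines()
            if any(key in line.lower() for key in interesting)]

if __name__ == "__main__":
    for line in tcp_trouble_counters():
        print(line)
```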
Anecdotally, back in the day (eGroups, 1999), we had a similar problem with paging as mentioned in the YouTube guy’s talk. For us, Solaris solved the paging for our Oracle databases: you could give priority to the pages backing executables so they wouldn’t get reclaimed by the VM system. It’s interesting, yet not surprising, to see that it is still a problem these days. I think the presenter was surprised about this. The misconceptions about VMs held by programmers, admins, and some OS people are a topic for another post. Anyway, I also had a tendency to turn off swap on machines. I like performance guarantees when I’m running a high-performance production site. It is one thing to page out old ‘getty’ processes; it is another thing if your machine is constantly paging. Once the paging happens, it doesn’t gracefully degrade. Usually, there is so much expected performance from a machine that any consistent paging activity would start dropping the service level of the machine pretty drastically. Pagers would go off and MRTG graphs would start flying up. If you have a decent business, all of this will be OK since you will just do the quick fix of buying more hardware. Hardware is cheap compared to the $$ you can make on the internet.
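For what it’s worth, the “is this box paging?” check is cheap to automate. A minimal sketch, assuming a Linux machine with /proc/vmstat (pswpin/pswpout are the standard Linux counter names, but verify them on your kernel):

```python
import time

def swap_rate(interval=5):
    """Pages swapped in/out per second over `interval` seconds.

    Reads /proc/vmstat twice; any sustained non-zero rate on a loaded
    production box is the kind of paging that drops service levels.
    """
    def read_counters():
        counters = {}
        with open("/proc/vmstat") as f:
            for line in f:
                name, value = line.split()
                counters[name] = int(value)
        return counters

    before = read_counters()
    time.sleep(interval)
    after = read_counters()
    return {name: (after[name] - before[name]) / interval
            for name in ("pswpin", "pswpout")}

if __name__ == "__main__":
    print(swap_rate())
```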
The next interesting topic he mentioned was the need to re-invent replication, caching, etc. on top of MySQL. Out of the box, MySQL is fine for that one big database in the middle. However, when you have to write software to manage the replication, and you are waiting on networks and disks, life can get a little white-knuckled. Especially if you are like everybody else and you write bugs. Bugs affecting a low level of your system (the persistence system) are very hard to diagnose and annoying to fix. Nobody likes to read the MySQL release notes to find out that the primary database (which you never take down, since it will take down the site) is running an old version that is slightly incompatible with the newer slaves, and hence not doing The Right Thing. This is detail-oriented stuff. If you are doing this, you are in the danger zone. The presenter eventually describes what every team discovers at a certain point: we need a distributed database. A lot of sites just use a sharding technique, but that leaves the door open for inconsistency and error. So far, nobody has solved this problem in a cheap and easy way.
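To make the sharding point concrete, the do-it-yourself version is usually just a routing function that hashes a key to one of N databases. A minimal sketch (the shard DSNs and the user_id key are made up for illustration); note that nothing here coordinates a write that touches two shards, which is exactly the inconsistency door I mean:

```python
import zlib

# Hypothetical shard list; in real life these would be your MySQL DSNs.
SHARDS = [
    "mysql://db0.internal/site",
    "mysql://db1.internal/site",
    "mysql://db2.internal/site",
    "mysql://db3.internal/site",
]

def shard_for(user_id):
    """Deterministically map a key to a shard by hashing it.

    Simple, but re-sharding (changing len(SHARDS)) moves almost every
    key, and a write spanning two shards has no transaction around it.
    """
    return SHARDS[zlib.crc32(str(user_id).encode()) % len(SHARDS)]
```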
One thing that has been solved in a cheap and easy way is caching. Memcached is the winner. It is easy to install, deploy, and use, and it is solid. Everybody uses it. I think they should add an M to LAMP for Memcached. It deserves it.
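The usual pattern is plain cache-aside in front of the database. A minimal sketch using the python-memcached client (the key format and the load_user_from_db loader are made up for illustration):

```python
import memcache  # python-memcached client, assumed installed

mc = memcache.Client(["127.0.0.1:11211"])

def get_user(user_id, load_user_from_db):
    """Cache-aside: try memcached, fall back to the database, then
    populate the cache with a short TTL."""
    key = "user:%d" % user_id
    user = mc.get(key)
    if user is None:
        user = load_user_from_db(user_id)  # hypothetical DB loader
        mc.set(key, user, time=300)        # cache for five minutes
    return user
```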
This also leads to another interesting artifact. I’ve often heard a few serial entrepreneurs exclaim that ‘man, somebody needs to create a company that sells the XYZ that we all reinvent at every startup’. I’ve heard Fletcher say it and Skrenta has blogged about it. Usually this is some grid/distributed database solution. This is all good intent, but usually these same people (successful entrepreneurs) reinvent only when they have to (and they do). Competing with open source is difficult, and open source businesses are not shooting stars. Eventually they return to their roots and focus on creating services on the internet. We’ll all let Steve Jobs create the nice shiny stuff. Fixing things would be nice, but it requires serious money and effort. The tools in this space will slowly evolve as open source solutions pop up out of the few companies at this level.
YouTube just proves that LAMMP is more than ‘good enough’ to scale. Yeah, it’s annoying that you still have to hire really smart people. Future presentations from the next internet megaco will look very similar, except there will be mention of tool ABC, which solves XYZ. Linux with Apache will still be a great combo. Regardless of the issues the presenter ran into, they still stuck with Linux, Apache, lighttpd, and a few other open source bits. They didn’t have to do anything special other than buy into a CDN. The relative simplicity of these systems works. I say relative since it could be better; it is just a lot better than trying to figure out the opaque magic of some of the other systems of the past (think older Solaris, Windows, etc.).
So, what am I trying to say:
- Overall, the same mistakes and technologies will be used, since the established practices for ‘really large scale’ are not written down or widely understood. Only off-the-shelf or ‘duh’ patterns will be common (“Duh, install memcached, dude”). People will continue to come up with new ways to reinvent replication for MySQL.
- Ironically, for the number of users these sites have, they don’t spend any money on software. It’s hard for the baker to buy bread from somebody else. The free/open source software continues to get better and better at solving these issues. You can’t beat the price. I doubt I will see many commercial companies succeed at monetizing this space.
- People are not efficiently utilizing their grid. Google is probably the exception here. Very few people are skilled enough to really understand the computation of their application and tune their disk, TCP, etc. adequately to achieve the performance levels that could be possible. They usually fly on the defaults, which, on existing systems, get them pretty far (see the sketch after this list). …which leads to the next point…
- Really Smart People will still be required, but the hardware and software will keep getting better at scaling… if the current workloads stay nearly the same. Said another way, systems in the near future will handle 4x? the users that existing systems do… at the same cost. You may see more complex sites get a little farther along without as many Really Smart People.
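On the ‘flying on the defaults’ point, here is a rough sketch of the sort of inventory that tuning starts from: reading a few of the Linux defaults straight out of /proc. The particular sysctls listed are just the usual suspects, not a recommendation:

```python
# A few knobs most sites leave at the defaults. Reading them is the easy
# part; knowing what your workload actually needs is the Really Smart
# People part.
SYSCTLS = [
    "/proc/sys/net/core/somaxconn",
    "/proc/sys/net/ipv4/tcp_rmem",
    "/proc/sys/net/ipv4/tcp_wmem",
    "/proc/sys/vm/swappiness",
    "/proc/sys/vm/dirty_ratio",
]

for path in SYSCTLS:
    try:
        with open(path) as f:
            print(path, "=", f.read().strip())
    except OSError:
        print(path, "not present on this kernel")
```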
BTW, apologies for the long post. It’s just an interesting topic.