Monday, October 18

Geek

Mongo Smash!

Anyways, there's this site called Foursquare, and apparently it fall down go boom just recently.  Possibly more than once.

They're running MongoDB, a database I tested and rejected earlier this year when looking for more elegant substitutes for MySQL.  I don't like SQL - never have - but it works, and MongoDB...  Doesn't.

Foursquare had two database servers with their database split ("sharded") across them.  The servers are Amazon EC2 instances, with 66GB of RAM each, fairly large by most people's standards.  The problem arose when one of the database shards got bigger than the available RAM.
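
For what it's worth, this is not a hard condition to watch for.  Here's a minimal sketch in Python with pymongo - the shard hostnames, database name, and 80% warning threshold are all invented for illustration; the 66GB figure is the one quoted above - of the kind of check that would have raised the alarm:

    from pymongo import MongoClient

    RAM_BYTES = 66 * 1024**3  # 66GB per EC2 instance, per the figure above

    # Hypothetical shard addresses and database name - substitute your own.
    for host in ("shard1.example.com:27017", "shard2.example.com:27017"):
        client = MongoClient(host)
        stats = client["foursquare"].command("dbStats")  # dataSize, storageSize, ...
        data_gb = stats["dataSize"] / 1024**3
        print(f"{host}: {data_gb:.1f}GB of data ({stats['dataSize'] / RAM_BYTES:.0%} of RAM)")
        if stats["dataSize"] > 0.8 * RAM_BYTES:
            print("  warning: this shard is about to outgrow physical memory")

dbStats only tells you how big the data files are, not how hot they are, but on a memory-mapped storage engine, once dataSize passes physical RAM you're living on borrowed time.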

...

If you have any experience with databases from before, say, 2007, you'll be shaking your head in disbelief at this point.  If you have any from before, say, 1997, you'll be dumbfounded.  Yes, the site fell over because the database didn't fit in RAM.

The total database size at the time of the outage was around 120GB.  At my day job - and though Foursquare is a small startup, my day job is at an even smaller startup - we add 400GB of data to our databases every single day.  Mind you, we do things to MySQL that would make the average DBA give up and go into volcanology as a safer bet, but it works.

Now, when I was considering MongoDB, the first thing I did was test its performance.  The second thing I did was test how it behaved in out-of-memory conditions.  (It crashed.)  Seems that Foursquare forgot to ask the second question.
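
The test itself was nothing fancy.  Something along these lines - a rough reconstruction rather than the original script, assuming a local mongod, with made-up document sizes and counts - is enough to show you how a database behaves once it outgrows memory:

    import time
    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    coll = client["smoketest"]["padding"]
    coll.create_index("seq")      # indexed lookups, so we measure paging, not table scans

    PAD = "x" * 100_000           # ~100KB of filler per document
    BATCH = 1_000

    for batch in range(10_000):   # keep loading until well past physical RAM
        coll.insert_many([{"seq": batch * BATCH + i, "pad": PAD}
                          for i in range(BATCH)])
        t0 = time.monotonic()
        coll.find_one({"seq": 0})  # touch the oldest, coldest document
        cold_ms = (time.monotonic() - t0) * 1000
        data_gb = client["smoketest"].command("dbStats")["dataSize"] / 1024**3
        print(f"{data_gb:6.1f}GB loaded, cold lookup took {cold_ms:.1f}ms")

Watch the latency curve as the data set passes RAM, then keep going and watch what the server does when the box actually runs out of memory.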

Posted by: Pixy Misa at 11:54 PM | Comments (8)

1 Waiting for J to comment.

Posted by: Pete Zaitcev at Tuesday, October 19 2010 12:56 AM (9KseV)

2 Sharding actually made the situation a lot worse for them. The chunk-split code was spectacularly expensive in the first 1.6.x releases, and was the worst possible thing to do when you were running out of physical memory. They're rapidly improving it, but when you combine the effect of database fragmentation on their memory-mapping, the slow and costly chunk splits, and the server-wide locks, Foursquare was facing a messy recovery even without any bugs.

Without sharding, their performance would have degraded gradually as the working set exceeded available RAM; if you can avoid full table scans, 2x RAM is still fast, and 3x is pretty good (modulo virtual-server bugs). With sharding in its current state, they either needed to start out with twice as many shards, or put SSDs into them to reduce the cost of paging (not really an option in Amazon's cloud).
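
If you want to know where you stand, the memory-mapped builds report it directly.  A quick pymongo sketch, assuming a local mongod, with the 3x threshold borrowed from the rule of thumb above:

    from pymongo import MongoClient

    client = MongoClient("localhost", 27017)
    mem = client.admin.command("serverStatus")["mem"]   # values are reported in MB

    resident_gb = mem["resident"] / 1024                # what's actually in physical RAM
    mapped_gb = mem.get("mapped", 0) / 1024             # only present with memory-mapped storage
    print(f"resident: {resident_gb:.1f}GB, mapped: {mapped_gb:.1f}GB")
    if mapped_gb > 3 * resident_gb:
        print("data files are well past RAM; expect heavy paging on cold reads")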

I've spent a lot of time talking with the guys at 10gen, and they're smart and capable, but I make sure I never forget that Eliot's dev laptop has a fast Core i7, 8GB of RAM, and a 512GB SSD. :-)

(...and my MongoDB server would dance with joy if it had 66GB of RAM or an SSD...)

-j

Posted by: J Greely at Tuesday, October 19 2010 01:14 AM (2XtN5)

3 I was thinking of you as I wrote that. ;)

Posted by: Pixy Misa at Tuesday, October 19 2010 12:33 PM (PiXy!)

4 Heh. In many ways, my choice of MongoDB was driven by Someone Else's Project. It's going to be used in a much more important piece of our back-end service, so while they were off designing and architecting, I was smoke-testing it in an area with no direct customer impact. We now have nearly four months of solid experience with it, and the SEP still hasn't hit QA, so we're way ahead of them.

Significantly, when I had Eliot and Roger on site a while back, one of my questions was about dealing with fragmentation. At that time, it had only just become possible to query the server and find out how badly fragmented you were, and the now-urgent online defragmentation (as opposed to tedious copy/promote or offline repair) was still in the "yeah, we can do that at some point" stage.

After a few months of deliberate testing, my collated daily archive collection is now heavily fragmented, to the point that even though it has over 400GB of free space, it still occasionally allocates a new 2GB extent for indexing. Someday I'll need to free up a terabyte on another server, copy the collection to it, then make it the new master. Before I fill the disk with half-empty datafiles.
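
The fragmentation query I mentioned boils down to comparing the space the documents actually occupy against the space that's been allocated on disk.  A small pymongo sketch - the database and collection names here are stand-ins, not our real ones:

    from pymongo import MongoClient

    db = MongoClient("localhost", 27017)["archive"]
    stats = db.command("collStats", "daily")   # size vs storageSize is the tell

    size_gb = stats["size"] / 1024**3          # bytes the documents themselves occupy
    alloc_gb = stats["storageSize"] / 1024**3  # bytes allocated on disk for them
    print(f"documents: {size_gb:.1f}GB, allocated: {alloc_gb:.1f}GB, "
          f"overhead: {alloc_gb / size_gb:.2f}x")

Until online compaction ships, the only real fixes are the offline repair or the copy-and-promote dance above.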

-j

Posted by: J Greely at Tuesday, October 19 2010 02:27 PM (2XtN5)

5 "Mongo" sounds like the name of a pro wrestler.

Posted by: Steven Den Beste at Tuesday, October 19 2010 02:40 PM (+rSRq)

6 OK, no sharding with Mongo. Got it. A new tenet of cargo cult engineering is being committed to stone tablets as I post this comment.

Posted by: Pete Zaitcev at Wednesday, October 20 2010 01:04 AM (9KseV)

7 Oh no. Is choice of database a new religious war, the way OSes and development languages have been?

Posted by: Steven Den Beste at Wednesday, October 20 2010 02:09 PM (+rSRq)

8 Oh, it gets worse. You always had the war between Oracle, Postgres, MySQL, and MS SQL Server, with the DB2 guys hanging out in the corner looking smug. Now you have Schema versus Other, with a large number of fundamentally different technologies lumped together under the "NoSQL" label, which some of the less strident have begun claiming means "Not Only SQL" instead of the obvious.

They get huge buzz because it's easy to throw something together in a real hurry and pitch the demo to an investor, usually in combination with Ruby on Rails or another rapid-prototyping framework. And, just like Rails, they tend to fail in interesting ways when you take your demo and put it into production. The MongoDB rant Pixy linked to covers it nicely, and the follow-up on Ruby nails it to the wall (while sadly overstating the virtues of PHP, but that's a side issue...).

It's easy to start throwing data into MongoDB and slapping a web interface on it with Rails. It's also often a substitute for defining the actual problem and designing the right solution. Pixy's blogging on the subject is more effort than most people put into their VC pitch, much less their code. :-)

-j

Posted by: J Greely at Wednesday, October 20 2010 04:45 PM (2XtN5)
