Saturday, March 31

Geek

Always In The Last Place You Look

I've been scratching my head trying to work out why the main query that drives Minx blogs is so inefficient.  I tweak indexes and I tweak selection criteria and I tweak sorting and nothing seems to improve the query time reported at the bottom of the page.  Now, normally it's only 20ms or so, but if you catch it after an update - a new post or comment - it blows out to 400ms or so because the query cache just got flushed.

So I tweaked and I twiddled and I fiddled and fretted, and I took the query and plugged in some parameters and ran it outside of Minx, and it turned out to be, in fact, reasonably quick.

Then I tried disabling the sidebar to see what effect that had on the total query time.

19ms cached; 28ms uncached.  Perfectly reasonable for selecting and sorting and joining multiple fairly complex (albeit small) subsets of 13,000 records.

It's those darn recent comments again!

Update: And now I remember why recent comments are slow.  When you list the recent comments, you're listing the last 20 (or whatever) comments in a folder.  But comments aren't in folders, they're in threads, so you have to look at the threads in the folder, and then find the most recent comments from among the comments in those threads.  Big slow join before the sort.

There's a straightforward way to optimise the common case - use either multiple queries or subqueries to restrict MySQL to only inspect the 20 most recently updated threads.  That gets messy for the general case, though, where you want page 17 of recent comments.

Need to spend a bit more time thinking about this one.

Update: Gargh!  MySQL 5.0.27 doesn't support LIMIT in subqueries anyway.

Update: Well, it doesn't if you're using IN to restrict the main search to a subset of threads selected by the subquery, but if you use the subquery to return the update datetime of the nth most recently updated thread and then restrict the search to threads updated since then, you're swimming in gravy.

...

Which would be pretty unpleasant, come to think of it.

Posted by: Pixy Misa at 06:13 AM | Comments (19) | Add Comment | Trackbacks (Suck)
Post contains 367 words, total size 2 kb.

1 Of course it's the last place you look, because once you find it, you stop looking.

Posted by: Steven Den Beste at Saturday, March 31 2007 05:18 PM (+rSRq)

2 Rule only applies to normal people.

Posted by: Pixy Misa at Saturday, March 31 2007 05:19 PM (PiXy!)

3 New query is in place. Seems not to have broken things too badly...

Posted by: Pixy Misa at Monday, April 02 2007 03:28 AM (PiXy!)

4 Seems to be roughly 20x faster on the common worst case (query cache flushed, showing most recent comments) and no slower on the best case (query in cache) or absolute worst case (cache flushed, page n of n).

Posted by: Pixy Misa at Monday, April 02 2007 03:32 AM (PiXy!)

5 Except that it didn't work!

Well, the way it worked was that if you wanted, say, the 20 most recent comments, it found the date of the 20th most recently updated post, and only scanned the comments of posts updated at least as recently as that.

If you didn't have 20 posts, that kind of went bang.

Testing a modified version now.

Posted by: Pixy Misa at Monday, April 02 2007 12:25 PM (PiXy!)

6 Yeah, that seems to be okay too.  Take the max() of the updated datetime of the N most recent posts instead.

Posted by: Pixy Misa at Monday, April 02 2007 12:26 PM (PiXy!)

7 Okay, I'm not sure that the new version actually works, performance-wise.  Test test...

Posted by: Pixy Misa at Monday, April 02 2007 12:32 PM (PiXy!)

8 Nope.  Crap.

Posted by: Pixy Misa at Monday, April 02 2007 12:32 PM (PiXy!)

9 What if I tweak the query a little to use this new index?

Posted by: Pixy Misa at Monday, April 02 2007 12:35 PM (PiXy!)

10 No.  Poop.

Posted by: Pixy Misa at Monday, April 02 2007 12:35 PM (PiXy!)

11 Fnordles.  The new subquery works just fine as a stand-alone query, but plug it into the main query and it all sucks.  Some kind of limitation or bug in the MySQL query optimiser?  Dunno.

But since an extra query only adds a few hundred microseconds, and I've already spent hours trying to make this work, I'm gonna go for the easy way out.

Posted by: Pixy Misa at Monday, April 02 2007 12:56 PM (PiXy!)

12 Well, that's not working either.  But now that I can see the results from the subquery, I know why it's not working.

I don't know how to make it work, but I know why it's not working.

Posted by: Pixy Misa at Monday, April 02 2007 01:15 PM (PiXy!)

13 Trying again, this time with nested subqueries...

Posted by: Pixy Misa at Monday, April 02 2007 01:46 PM (PiXy!)

14 Yes?

Posted by: Pixy Misa at Monday, April 02 2007 01:46 PM (PiXy!)

15 Yes!

Posted by: Pixy Misa at Monday, April 02 2007 01:46 PM (PiXy!)

16 Hey, Webmaster! There's some loon using the name "Pixy" who's spamming your blog. If I were you, I'd ban him. (He can't possibly have anything interesting to say.)

Posted by: Steven Den Beste at Monday, April 02 2007 02:05 PM (+rSRq)

17 Thanks for the tip, Steven.

By the way, I think your network cable is plugged in upside down!

Posted by: Pixy Misa at Monday, April 02 2007 02:21 PM (PiXy!)

18 Could be; I'll check it out!

Posted by: Steven Den Beste at Monday, April 02 2007 02:31 PM (+rSRq)

19 I hope that's all it is... I don't want to have to turn my monitor upside down.

Posted by: Wonderduck at Monday, April 02 2007 04:15 PM (R/R0E)

Hide Comments | Add Comment

Comments are disabled. Post is locked.
52kb generated in CPU 0.0172, elapsed 0.1035 seconds.
56 queries taking 0.0915 seconds, 356 records returned.
Powered by Minx 1.1.6c-pink.