Saturday, July 29
You can get a dual Xeon 5160 (that is, 3.0GHz Core 2 Duo) server from Dell with 16GB of memory and four 300GB SAS disks for $10,000 (US).
That offers 90% of the SPECint_rate performance of a fully-loaded Sun E10K. (Assuming I've done the conversion right.)
If you happen to be looking for a compute node (say for a large-scale blogging app) and don't need huge amounts of memory or storage, you can get it for half that price.
Interesting point: The 3.0GHz chip is only $500 more than the 1.6GHz version. A dual 1.6GHz system is about the same price as a single 3.0GHz.
Oh, and AMD have responded with huge price cuts on Athlon 64 X2s. Not so painful if you bought one of the cheaper versions like the 3800+, but if you recently shelled out for a 5000+ or an FX, ouchie.
Thursday, July 27
I've been trying to get networking functioning in virtual machines (this time, using Microsoft Virtual PC) on my notebook again.
I've decided to just enslave myself to Cthulhu. The end result is the same - your brain and soul get eaten and you become a walking zombie fish-monster - but it's quicker and less painful.
Saturday, July 22
So I was scratching my head, wondering why my POST-Redirect-GET wasn't working. All I should have to do is set the Location header, set the return status to 303, and go. But all I got was a blank page, no matter what I tried.
Okay, yeah, it might help to actually set the status field rather than creating a new "Status" header.
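For the record, here is roughly what the fix looks like, as a minimal sketch in plain WSGI rather than CherryPy's own API. The `app` and `call` functions and the `/thanks` URL are made up for illustration. The point is that the 303 belongs in the status line, with Location as an ordinary header next to it.

```python
def app(environ, start_response):
    """Tiny WSGI app: a POST is answered with a redirect to a GET page."""
    if environ["REQUEST_METHOD"] == "POST":
        # The 303 goes in the status itself, not in a header named "Status".
        start_response("303 See Other", [("Location", "/thanks")])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Thanks!"]


def call(method):
    """Hypothetical helper: invoke the app directly and capture the response."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"REQUEST_METHOD": method}, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call("POST")
print(status, headers["Location"])
```

If the status stays at 200 and only a stray "Status" header is added, the browser has no reason to follow the Location header, which would fit the blank page.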
Thursday, July 13
Not to be using CherryPy for static files.
Well, they do tell you that in the documentation. It works just fine, and makes things a lot easier to set up. But it imposes the same level of overhead for static files as it does for dynamic pages.
6ms of overhead when a page takes 10 to 30ms to generate isn't a big problem.
6ms of overhead when a page takes 2ms to fetch from the cache is more of a problem, but it's better than 160ms of overhead.
6ms of overhead for a static file is... not so good.
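To put numbers on that, here is a quick sketch that works out what share of each request is framework overhead, using only the timings quoted above. The function name is mine, and the static file case is left out because I haven't measured how long the file itself takes to serve.

```python
def overhead_share(overhead_ms, work_ms):
    """Fraction of total request time spent on framework overhead."""
    return float(overhead_ms) / (overhead_ms + work_ms)


# (overhead in ms, useful work in ms), taken from the figures in the post
cases = [
    ("generated page, fast", 6, 10),
    ("generated page, slow", 6, 30),
    ("cached page", 6, 2),
    ("cached page, old overhead", 160, 2),
]

for label, overhead, work in cases:
    share = overhead_share(overhead, work)
    print("%-26s %5.1f%% overhead" % (label, share * 100))
```

So a generated page loses somewhere between a sixth and three eighths of its time to overhead, a cached page loses three quarters, and under the old 160ms figure a cached page was nearly all overhead.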
So now I get to play with mod_rewrite. Because I don't have a live crocodile to shove down my pants.
Update: Oh look, mod_proxy isn't enabled. So I have to recompile Apache before I can use mod_rewrite with the [P] flag. I'll leave it for another day, I think.
Minx is now pie-ified. That's reduced the overhead per page from about 160ms to something like 6ms. Since a page fetched from the cache takes about 2ms to process, and an individual entry page about 12ms, that makes the whole thing just a little bit zippier.
Need to do some more bug testing and performance testing, but it seems stable in terms of speed and memory after coughing up 60,000 pages. With 10 threads, it uses 11MB of memory, though its virtual memory footprint is 114MB. Not entirely sure why it is allocating all that memory and never using it, but since the only real problem that causes is that I can't have more than about 350 threads running in any one Minx instance (119MB real, 2935MB virtual), I can probably live with it. And that only applies on 32-bit platforms anyway.
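A bit of arithmetic on those two data points gives the per-thread cost. This is only a sketch: it assumes memory grows linearly with thread count, which two measurements can't prove, and the function name is made up.

```python
def per_thread_mb(threads_a, mb_a, threads_b, mb_b):
    """Slope between two (thread count, memory) measurements, in MB per thread."""
    return float(mb_b - mb_a) / (threads_b - threads_a)


# Measurements from the post: 10 threads and 350 threads
virtual_per_thread = per_thread_mb(10, 114, 350, 2935)
real_per_thread = per_thread_mb(10, 11, 350, 119)

print("virtual: %.2f MB per thread" % virtual_per_thread)
print("real:    %.2f MB per thread" % real_per_thread)
```

That works out to a bit over 8MB of virtual memory per thread against roughly a third of a megabyte of real memory. A figure that close to 8MB would be consistent with each thread reserving a large stack it mostly never touches, but that is a guess on my part and not something I've confirmed.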
One thing I'm not doing right now is running with Psyco. Even in the worst case (cached pages), it gives a performance boost of 20%. But it also leaks memory like Netscape 4.5.
Python doesn't assign values to variables, it binds names to values.
Need to write that on a stickynote and attach it to my monitor.
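And since a sticky note only goes so far, here is a small illustration of the difference. The function names are invented for the example.

```python
a = [1, 2, 3]
b = a              # binds the name b to the same list object as a
b.append(4)        # mutates the one shared object
assert a == [1, 2, 3, 4]
assert a is b

b = [0]            # rebinds b to a new list; a is untouched
assert a == [1, 2, 3, 4]


def rebind(items):
    # Builds a new list and binds the local name to it.
    # The caller's list is not changed.
    items = items + [99]
    return items


def mutate(items):
    # Changes the object the caller's name is bound to.
    items.append(99)


original = [1]
rebind(original)
assert original == [1]
mutate(original)
assert original == [1, 99]
```

The same rule is behind most scoping surprises: an assignment inside a function binds a local name unless the name is declared global.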
Tuesday, July 11
We hates Python scoping rules. We hates them forever.
(Currently going for a doc crawl on the theory that it can't be that broken.)
Okay, the problem I'm having involves modules, threads, and thread-specific global data. I haven't solved the problem with modules yet, but it turns out there was some magic added to Python 2.4 for thread-specific globals (threading.local). That's a comfort, because I knew that worked, but I couldn't figure out how. CherryPy, the web framework I'm using, supports this under 2.3, but it turns out that it's a hack and it's very slow. So I'm not going mad. Or at least, no more than usual.
I'm moving Minx from the test design, where it is a single CGI program (and so each request is perfectly isolated and I can slap the code together any which way) to production, as a multi-threaded persistent server. Which is much more fiddly in terms of structuring the code and variables, but is twenty to thirty times faster.
Up to 95% of the time taken by the CGI version is overhead: starting a shell, then starting a Python interpreter, loading the twenty or so libraries used, opening a connection to MySQL, and so on. The multi-threaded version does all of that once. (Or at worst, once per thread, for a persistent thread pool.) It also uses Psyco, the Python just-in-time compiler, which adds a 30% to 50% speed boost for this sort of app. For the CGI version, Psyco takes long enough to do the compile that the overall performance is worse in most cases...
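Those figures hang together, at least on the back of an envelope. The sketch below assumes the persistent server removes the startup overhead entirely and that Psyco's boost simply multiplies on top, neither of which I've measured separately. The function names are made up.

```python
def speedup_from_overhead(overhead_fraction):
    """Speedup if the given fraction of request time is removed entirely."""
    return 1.0 / (1.0 - overhead_fraction)


def overhead_for_speedup(speedup):
    """Overhead fraction that would have to vanish to reach a given speedup."""
    return 1.0 - 1.0 / speedup


base = speedup_from_overhead(0.95)   # CGI startup overhead from the post
with_psyco_low = base * 1.3          # Psyco boost of 30%
with_psyco_high = base * 1.5         # Psyco boost of 50%

print("no startup overhead: %.0fx" % base)
print("with Psyco: %.0fx to %.0fx" % (with_psyco_low, with_psyco_high))
print("overhead needed for 30x alone: %.1f%%" % (overhead_for_speedup(30) * 100))
```

Dropping 95% overhead gives about 20x, and a 30% to 50% boost on top of that lands at about 26x to 30x, which is in line with twenty to thirty times faster.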
Only, because the threads weren't actually isolated from one another, it didn't work at all. I could either add an extra parameter to all the roughly 100 functions I've written so far, or I could work out how to do thread-specific globals.
Update: Okay, all is forgiven. The threading.local trick works flawlessly, even with modules. Thread-local global data for one module is not visible in another, but even if that complicates things for me, that's right. They're modules, not include files. So I have thread-local module-local global variables... Yay!
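A minimal sketch of the trick, for future reference. The names `request_state`, `handle`, and `describe` are invented here and are not Minx code, and the barrier is only there to force both threads to overlap (it needs a reasonably recent Python 3).

```python
import threading

# Module-level object. Each thread sees its own set of attributes on it.
request_state = threading.local()

# Makes both threads store their value before either one reads it back.
barrier = threading.Barrier(2)


def describe():
    # No extra parameter needed: this reads the current thread's own value.
    return "handling request for %s" % request_state.user


def handle(user, results):
    request_state.user = user    # visible only in this thread
    barrier.wait()               # the other thread has set its value too
    results[user] = describe()


results = {}
threads = [
    threading.Thread(target=handle, args=(name, results))
    for name in ("alice", "bob")
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
# The main thread never set request_state.user, so it has no such attribute.
print(hasattr(request_state, "user"))
```

Each thread reads back its own value even though both went through the same module-level name, which is what saves adding an extra parameter to every function.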
Update: And it works perfectly with CherryPy. I expected that, because it only makes sense that CherryPy would be using the standard threading module, but there's a difference between being the only sensible way to do something and actually testing it.