Thursday, June 29
This site may hiccup a little, because I'm converting it to Minx.
Do not panic!
It doesn't help.
Tuesday, June 27
SESSION is an associative array (aka Dictionary). When the session times out, things like 'Tempfile' are no longer defined. (PHP has an unset() function that undefines a reference.) But when PHP sees an undeclared reference, it doesn't error out -- instead it substitutes '' (a blank string) if the reference occurs within a string. So now the user is executing
rm -r /var/public_www/
Yeah.
As you might imagine, this behavior makes PHP very dangerous in the hands of an idiot.
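For illustration only, here is a rough sketch of that failure mode in Python rather than PHP. The lookup with a default of '' stands in for PHP's silent substitution. The function names and the way the command is assembled are made up for this example, not taken from the code in question.

```python
def cleanup_command_lenient(session, base="/var/public_www/"):
    # Mimics the PHP behaviour: a missing key quietly becomes ''.
    name = session.get("Tempfile", "")
    return "rm -r " + base + name


def cleanup_command_strict(session, base="/var/public_www/"):
    # Refuses to build the command unless the key is present and non-empty.
    name = session.get("Tempfile")
    if not name:
        raise KeyError("Tempfile missing from session; refusing to build command")
    return "rm -r " + base + name


live = {"Tempfile": "upload_1234.tmp"}  # hypothetical file name
expired = {}  # what is left after the session times out

print(cleanup_command_lenient(live))     # rm -r /var/public_www/upload_1234.tmp
print(cleanup_command_lenient(expired))  # rm -r /var/public_www/
```

The strict version turns the expired session into a loud error instead of a command aimed at the whole directory.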
Friday, June 23
I hate regular expressions.
There's this thing I call information density*. Regular expressions are extremely information-dense. So are (for example) Forth and APL. With any of these, you can express a very complex algorithm in a very short sequence of symbols, but that comes with a cost.
People are used to dealing with information density within a certain range. For the most part, the information we receive has massive amounts of redundancy; you can often miss half the message and still understand it. Not so with regular expressions - every single bit matters. There are no (or almost no) cues to what is going on; you have to inspect each symbol one at a time, parse them into groups, interpret the groups, work out the relationships between the groups... And do it all correctly.
Computers are good at that. Humans not so much.
Well, computers are supposed to be good at it, anyway.
The subject arose because I needed a string-formatting language for the templating system in Minx. Python has fairly good formatting for numbers, dates, and times, but it has no equivalent formatting library for strings. It thinks it has several, but it doesn't. What it has is libraries that format things into strings, but nothing to format the strings themselves.
So I used regular expressions. And the first example - not particularly complicated - sent the template engine into what appeared to be an infinite loop. Worked fine in the examples I tested. Worked fine for the first three items on the page. Raised an exception for the next item (quite validly). And then tried to process the item after that and was never heard from again.
I'm sure I could fix it, but it remains that it happened to me, and I wrote the blasted thing. If it happens to me, then a week after launching the software I'm going to find the server with a load average of 700 doing nothing but processing regular expressions for ever and ever.
So I wrote a little text-formatting library instead.
With plugin support.
Sixty lines of code, does almost everything I need.
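To give a feel for what a small, regex-free formatting library with plugins might look like, here is a minimal sketch. It is not the Minx library: the registry, the decorator, the two formatters, and the "name:arg|name" spec syntax are all assumptions made up for this example.

```python
FORMATTERS = {}


def formatter(name):
    """Register a function as a named string formatter (the 'plugin' hook)."""
    def register(func):
        FORMATTERS[name] = func
        return func
    return register


@formatter("upper")
def upper(text, arg=None):
    # Ignores its argument; upper-cases the whole string.
    return text.upper()


@formatter("truncate")
def truncate(text, arg="20"):
    # Cuts the string to at most `arg` characters and marks the cut.
    limit = int(arg)
    return text if len(text) <= limit else text[:limit].rstrip() + "..."


def apply_format(text, spec):
    """Apply a spec such as 'truncate:10|upper' using plain string splitting."""
    for step in spec.split("|"):
        name, _, arg = step.partition(":")
        func = FORMATTERS[name.strip()]
        text = func(text, arg) if arg else func(text)
    return text


print(apply_format("regular expressions are dense", "truncate:10|upper"))
# REGULAR EX...
```

Adding a formatter is just another decorated function, and every step is an ordinary string operation, so there is no pattern that can backtrack forever.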
* There's a specific term for it, but I can't recall it at the moment. But it relates to randomness and entropy.
Thursday, June 22
It's not the memory - Memtest-86 gives it a clean bill of health.
It's not the disk - chkdsk runs fine, and my applications have no problems.
Windows says it's a disk problem but it acts like it's a memory problem.
What can cause that?
The pagefile. The evil, good-for-nothing, rat bastard pagefile.
I thought to myself, If this were a real operating system, it would have a log of all these errors.
Then I thought, It is a real operating system. A crappy one, but a real OS nonetheless.
And it has crappy log files, but they exist, and they were full of errors - all relating to the pagefile.
As soon as I manage to get the darn thing to boot, I'm going to disable it.
Again. I already disabled it, but it didn't take. Who knows why; this is Windows.
Okay, it finished booting, and now has no pagefile.
Let's see if it crashes.
So far, so good. I did get one of those "delayed write failed" errors (so maybe it is the disk after all), but I managed to watch last week's episode of Haruhi Suzumiya on my new TV without anything catching fire, blowing up, crashing, or collapsing into a closed space.
Which is good enough for now.
Tomorrow I send the motherboard from my new PC back for replacement. Then I plan to (finally!) get the forms processing working in Minx. This is an example of what you can do with fairly simple templates; I intend to expand on that. A lot.
Wednesday, June 21
Either not the memory, or not only the memory.
Update: Memtest-86 is on the third pass without finding any errors.
Not the memory, then.
Not the disk drive, since I swapped that yesterday and it died within ten minutes.
It happens whether I'm on battery or mains power.
Dell is having a sale right now...
I'm just hoping it was only the memory.
Swapped drives in my notebook: BSOD within 10 minutes.
Pulled out the original 256MB of memory: No BSOD so far.
Windows Explorer has restarted a couple of times, but that is not exactly unusual. I'm going to run another chkdsk, because if the memory was playing up, there could be some nasty things lurking in my filesystem.
No progress on the media centre box; looks like that will have to go back for the friendly folks at EYO to take a look at. I've been buying stuff from them for years, and this is the first time something has just plain not worked. Video cards that don't run under Linux, sure. Network cards that are incompatible with my motherboard, yep. And the Hard Drive Destruction Bunny is always lurking around the next corner. But this is the first time I haven't been able to get a new toy to at least boot.
Wednesday, June 14
|Standard Python||Python + Psyco|
For last 500 entries on Munuviana. All times in seconds. Offer void where prohibited by federal, state or local laws, regulations or institutional policy. Offer ends June 30, 2006. Benchmarked on a Pentium D 820 running CentOS 4.2. Python 2.4.2 compiled with GCC 3.4.4. Mileage may vary. Contents may ship during shrinking. Do not eat iPod shuffle. Sanitisation of comments only guarantees valid HTML. The content is your own problem.
Monday, June 12
Minx is currently parasitic* on the MT database and user interface, but it has a template system all its own.
The pages are now fully templated, with no funny stuff going on, so I'll show you what it looks like:
We can see three types of tags here:
A simple tag, [blog.name], which just looks up the matching database field for the currently active blog and inserts it into the page at that point.
A here tag, which is used to simplify block processing. The [posts.here] and [comments.here] are the most common examples of this. Without any parameters** they loop through the posts or comments using the default settings for your blog and the default post/comment template, as appropriate.
The post and comment templates look like this:
Post Template
<h2>[post.newdate]</h2>
<p>
<b>[post.title]</b><p>
[post.text]
<p class="posted">
Posted by: [post.authorlink] at <a href="[post.url]">[post.time]</a>
| <a href="#" onClick="ShowHide('cc[post.id]'); return false;">Comments ([post.comments])</a>
| <a href="http://blog2.mu.nu/cgi/mcomment.cgi?post=[post.id]">Add Comment</a>
| Trackbacks (Suck)
</p>
<div id="cc[post.id]" style="display:none">
[comments:here]
<p class="posted">
<a href="#" onClick="ShowHide('cc[post.id]'); return false;">Hide Comments</a>
| <a href="http://blog2.mu.nu/cgi/mcomment.cgi?post=[post.id]">Add Comment</a>
</p>
</div>

Comment Template
<div id="c[comment.id]">[comment.text]</div>
<p class="posted">
Posted by: [comment.authorlink] at [comment.datetime]
</p>

Again, we have regular HTML, with just a bunch of simple tags to insert the desired values.
Minx is intended to let you have full control of post and comment selection from the template, overriding the default settings, but those tags are more complex to process and don't work yet.
Finally, we have magic tags. We have two examples of this, the pager and the stats. Magic tags do magic stuff that isn't necessarily available using regular template substitution. The pager lets you go to the next/previous page of the blog (a fancier pager is coming); the stats show performance information (and a fancier version of that is coming too).
The central idea is to make as much of the feature set as possible available without making the templates scary. So you don't need to have a complex nested structure of post tags and comment tags. Just put [posts.here] in your page template (with maybe some optional parameters) and [comments:here] in your post template. Set the appropriate settings on your blog options screen and you're in business!
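As a rough picture of how simple tags could be filled in, here is a small Python sketch. It is an assumption about the general technique, not Minx's actual engine: the function name and the nested-dict context are invented for the example, and here tags and magic tags are simply left alone.

```python
def render_simple_tags(template, context):
    """Replace [object.field] tags with values from a nested dict.

    Anything that does not resolve is left in place, so block tags such
    as [comments:here] pass through untouched for a later stage.
    """
    out = []
    pos = 0
    while True:
        start = template.find("[", pos)
        if start == -1:
            break
        end = template.find("]", start)
        if end == -1:
            break
        out.append(template[pos:start])
        obj, dot, field = template[start + 1:end].partition(".")
        if dot and obj in context and field in context[obj]:
            out.append(str(context[obj][field]))
            pos = end + 1
        else:
            # Not a known simple tag: keep the bracket and carry on scanning.
            out.append("[")
            pos = start + 1
    out.append(template[pos:])
    return "".join(out)


page = '<div id="cc[post.id]"><b>[post.title]</b> [comments:here]</div>'
print(render_simple_tags(page, {"post": {"id": 42, "title": "Hello"}}))
# <div id="cc42"><b>Hello</b> [comments:here]</div>
```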
Next up: A user interface...
Update: Minx now pulls its templates dynamically from the MT database. If you create index templates called mxPage, mxPost, and mxComment, Minx will use them to override the default templates. (Which are loaded dynamically as well.)
You can also include templates using the [include template] tag. The template to be included must also begin with "mx", since Minx only looks at those, but you don't actually specify the "mx" when you use the tag. So, for example, you can create a blogroll template called mxBlogroll and include it with the tag [include blogroll]. (It's not case-sensitive.)
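The lookup rule described here (add the "mx" prefix, ignore case) can be sketched in a few lines of Python. This is only an illustration: the function name and the dict of templates are assumptions, and the sketch expands one level of includes only.

```python
def expand_includes(text, templates):
    """Replace [include name] with the body of the template 'mx' + name.

    `templates` maps template titles to bodies. Titles that do not start
    with "mx" are ignored, and matching is case-insensitive.
    """
    lookup = {
        title.lower(): body
        for title, body in templates.items()
        if title.lower().startswith("mx")
    }
    prefix = "[include "
    out = []
    pos = 0
    while True:
        start = text.find(prefix, pos)
        if start == -1:
            break
        end = text.find("]", start)
        if end == -1:
            break
        out.append(text[pos:start])
        name = text[start + len(prefix):end].strip().lower()
        body = lookup.get("mx" + name)
        # Unknown templates are left as-is so the mistake is visible.
        out.append(body if body is not None else text[start:end + 1])
        pos = end + 1
    out.append(text[pos:])
    return "".join(out)


templates = {"mxBlogroll": "<ul><li>Munuviana</li></ul>", "Sidebar": "ignored"}
print(expand_includes("<div>[include blogroll]</div>", templates))
# <div><ul><li>Munuviana</li></ul></div>
```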
Update: Conditional processing with if and ifn. Either can be used with any simple tag to test whether the value of that tag is "true" or not. A tag is true if it is a non-zero number or a non-empty string; otherwise it is false. Any text between [if tag] and [/if] is included only if the value of the tag is true, and any text between [ifn tag] and [/ifn] is included only if the value of the tag is false.
You can't nest if tags, though you can nest an ifn inside an if and vice versa. If you want to get more complicated than that, for now you'll need to use sub-templates with the [include template] tag.
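Here is a minimal Python sketch of those rules, assuming tag values arrive in a flat dict keyed by tag name. The function names are hypothetical and this is not Minx's code. Each opening tag is paired with the first matching close tag, which is why an if inside an if would not work, while an ifn inside an if (or the reverse) does.

```python
def is_true(value):
    """A tag is true if it is a non-zero number or a non-empty string."""
    if isinstance(value, str):
        return value != ""
    if isinstance(value, (int, float)):
        return value != 0
    return False


def _apply(template, values, word, keep_when):
    # Handles one kind of block: word is "if" or "ifn".
    open_prefix = "[" + word + " "
    close_tag = "[/" + word + "]"
    out = []
    pos = 0
    while True:
        start = template.find(open_prefix, pos)
        if start == -1:
            break
        name_end = template.find("]", start)
        close = template.find(close_tag, name_end) if name_end != -1 else -1
        if close == -1:
            break  # unbalanced tag: leave the rest untouched
        out.append(template[pos:start])
        name = template[start + len(open_prefix):name_end].strip()
        if is_true(values.get(name)) == keep_when:
            out.append(template[name_end + 1:close])
        pos = close + len(close_tag)
    out.append(template[pos:])
    return "".join(out)


def apply_conditionals(template, values):
    """Resolve [if tag]...[/if] blocks first, then [ifn tag]...[/ifn] blocks."""
    template = _apply(template, values, "if", True)
    return _apply(template, values, "ifn", False)


text = "[if post.comments]Comments![/if][ifn post.comments]Be the first.[/ifn]"
print(apply_conditionals(text, {"post.comments": 3}))  # Comments!
print(apply_conditionals(text, {"post.comments": 0}))  # Be the first.
```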
* Or rather, symbiotic.
** And, um, parameters aren't working yet.
Rather than writing everything from scratch to start with, I put together what I had and pointed it at our existing Movable Type database (which has a number of changes over the original, admittedly). And it works.
40 milliseconds is a bit depressing, though. I'll have to see if I can speed that up a bit.
Okay, some timings: Last 200 entries at Munuviana, with inline comments: 420ms. Without inline comments: 90ms. Having inline comments causes a lot of extra SQL queries; I don't know if it's that or the text processing that's taking the time.
A normal page only shows 20 entries, and takes around 45ms with comments and 15ms without. It's a bit hard to time that more precisely, which is why I bumped the page size up.
Every post and comment has to be passed through the template engine; every comment is also processed through a SGML parser to strip out unwanted or invalid HTML tags. I'd love to get a complete page out in 10ms, but I don't think I can. Well, if it's cached, then sure, but not if the whole thing needs to be dynamically generated.
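For a sense of what that comment clean-up involves, here is a hedged sketch using html.parser from Python's standard library as a stand-in for whatever SGML parser Minx uses. The whitelist, class name, and href rule are assumptions for the example, not Minx's actual policy.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attributes allowed on that tag.
ALLOWED = {"a": {"href"}, "b": set(), "i": set(), "em": set(),
           "strong": set(), "p": set(), "blockquote": set()}


def _keep_attr(tag, name, value):
    if name not in ALLOWED[tag]:
        return False
    if name == "href":
        # Only plain web links; this drops javascript: and similar schemes.
        return (value or "").lower().startswith(("http://", "https://"))
    return True


class CommentSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself, keep any text inside it
        kept = "".join(
            ' %s="%s"' % (name, escape(value or "", quote=True))
            for name, value in attrs
            if _keep_attr(tag, name, value)
        )
        self.out.append("<%s%s>" % (tag, kept))
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray or disallowed close tag
        # Close everything opened since, so the output stays well nested.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(comment_html):
    parser = CommentSanitizer()
    parser.feed(comment_html)
    parser.close()
    while parser.open_tags:  # close anything the commenter left open
        parser.out.append("</%s>" % parser.open_tags.pop())
    return "".join(parser.out)


print(sanitize('<b>bold <i>both</b> <blink>x</blink>'))
# <b>bold <i>both</i></b> x
```

As in the post, this only aims at valid markup built from permitted tags; what the comment actually says is still the commenter's problem.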
<Goes off to implement caching system.>
<Comes back again.>
Retrieved from cache, processing time 0.0 seconds.
Yeah, that'll do.
Only problem is that saving the text up to be cached takes an extra 10ms or so. I need to try the StringIO/cStringIO functions and see if they help.
Update: cStringIO provides no performance advantage at all. Which is good in that I know that the easy way to code it is just as efficient, but bad in that I can't speed it up at all.
Sunday, June 11
I've been working on the blog-code, all the live-long
It's good to be doing development again, rather than running around like a hamster on crack trying to shore up systems under attack by hackers, spammers, and just plain lousy other-people's-code.
What I've mostly been doing so far, though, is installing stuff. Python needed an update, as did Apache. Didn't have PIL installed, or NumPy or Pyrex. Psyco has been patched. Memcached and its prerequisites. Bumped MySQL up to 5.0.22, and PostgreSQL up to 8.1.4. (I may end up just using MySQL, but at this stage I'm keeping my options open.) CherryPy and Django and ReportLab. And PHP 5.1.4... (fixes MySQL option) And PHP 5.1.4 (fixes MySQL option correctly) And PHP 5.1.4 (what do you mean, you can't find zlib? How should I know where it is?)
Okay, never mind PHP 5.1.4. I'll do that later.
Oh, yes: I'm doing all of this on my notebook. Hooray for VMware Player!