Tuesday, May 02

Life

Well, Yeah

I'm working on the technical details for a new business plan that requires 24 x 7 server and network uptime. I have it easy at the moment, since the company I work for is basically 12 x 5 for the in-house stuff. There's a lot of 24 x 7 stuff too, but the responsibility for that falls in other people's laps.

So I'm sweating bullets on network designs. You know the sort of thing: This router is pretty bulletproof, but if it does go down, then... Okay, we can put a backup here, and we can tweak it to take over the IPs automatically... But if this network link goes down, we still lose half the business, so let's split that... And so on.
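
The back-of-envelope arithmetic behind that sort of reasoning can be sketched in a few lines of Python. This is only an illustration: the availability figures and the function names are invented, and the sums assume failures are independent, which real networks rarely honour.

```python
# Rough availability arithmetic for a chain of network components.
# All figures are hypothetical, chosen only to illustrate the reasoning.

HOURS_PER_YEAR = 24 * 365


def series(*parts):
    """Availability of components that must ALL be up (e.g. router, then link)."""
    result = 1.0
    for availability in parts:
        result *= availability
    return result


def parallel(*parts):
    """Availability of redundant components where ANY one being up is enough.

    Assumes the components fail independently of each other.
    """
    all_down = 1.0
    for availability in parts:
        all_down *= 1.0 - availability
    return 1.0 - all_down


def downtime_hours(availability):
    """Expected hours of downtime per year at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


router = 0.999  # assumed availability of one router
link = 0.995    # assumed availability of one network link

# One router feeding one link: both must be up.
single = series(router, link)

# A backup router and a second link: each pair only needs one survivor.
redundant = series(parallel(router, router), parallel(link, link))

print(f"single path: {single:.5f} ({downtime_hours(single):.1f} h/year down)")
print(f"redundant:   {redundant:.5f} ({downtime_hours(redundant):.1f} h/year down)")
```

The model says nothing about shared failure modes, such as both links running through the same upstream switch, and those are exactly the cases that tend to bite.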

But really, things go kerflooie all the time. I arrive home to find no internet. Why is this, I ask. The answer comes:

Hi all,

Just to let you know customers connected to the following exchanges may currently be unable to access the Internet:

Liverpool
North Ryde
Manly
Menai
Miller
Minto
Miranda
Mona Vale
Mosman
Northbridge

This is believed to be a fault within an upstream provider's network and we are working to have it resolved as soon as possible.

Well, that's just ducky. As someone noted:

Strangely looks like a Sydney suburban dictionary attack!

Curiously enough, I don't live in any of those places. What's going on?

Our provider has lost power to a switch in Sydney. This has taken out one of our aggregate links to the above exchanges. We are waiting to hear back from them.

And this affects me because?

As a result of the work being performed to resolve the original fault the following exchanges are now also affected:

Balmain
Fairfield West
Castle Hill
Coogee
Epping
Hornsby
Hurstville
Maroubra
North Parramatta
Randwick

Ah. Nice one.

Update: See also: TypePad, Hosting Matters. No finger pointing, just noting that shit happens. Perfectly redundant and fault-tolerant systems are so expensive and complicated that (a) no-one can afford them, so they don't get built, and (b) no-one can understand them, so they fail anyway because of human error.

Which doesn't mean you don't make the effort. We haven't had a power outage at our new office since we moved in (February '05) but I'm still budgeting for dual UPSes. (I just checked one of the web servers - 433 days uptime, and that one isn't on a UPS.)

Posted by: Pixy Misa at 06:46 AM