Ambient Irony

They are my oldest and deadliest enemy. You cannot trust them.
If Hitler invaded Hell, I would give a favourable reference to the Devil.

Wednesday, May 25

Situations Vacant

At my day job, we're looking for two or three sysadmin / programmer types. The field is real-time social network and web analytics.

Location: Sydney, San Francisco, New York

Required: Solid working knowledge and at least two years practical experience with Linux system administration, Python, and databases.

Useful: MySQL, Redis, Cassandra, Xapian, RabbitMQ, PHP, Apache, Nginx, memcached, Mercurial, networking, virtualisation, system monitoring tools.

Other knowledge: Statistics, parallel processing & scalability.

Not of interest: Java, .Net, Microsoft platform in general

CS or similar degree is valuable but not critical if you have the equivalent practical experience.

Training will be provided at our main office in Sydney; we'll fly you out here for a few weeks if you're based in SF or NY.

Good salary and benefits (commensurate with experience and talent), flexible working hours, opportunity to telecommute part time. Will be on an on-call roster for system outages.

Email me (use help@mee.nu) if you're interested or have questions and I'll put you in touch with the right people. Oh, and I'll probably be doing the technical interview. wink

Posted by: Pixy Misa at 02:42 PM | No Comments | Add Comment | Trackbacks (Suck)
Post contains 179 words, total size 1 kb.

Monday, May 16

Another Story

One of the drives in Nagi (my Windows box) is on the way out, as evidenced by system freezes that leave the drive light solidly on but nothing happening. If I don't mess around too much I can still move my mouse and maybe switch to an application that doesn't want to access the disk right now.

This is not a good thing.

Not good.
So I went out to get some external drives to back everything up. My friendly local electronics and computer stuff store offered several options: A 1TB Western Digital MyBook Essential drive for $129.99, a 2TB Western Digital MyBook Essential drive for $129, and a 3TB Western Digital MyBook Essential drive for $269.

I love that kind of pricing; it makes decisions so easy.

Anyway, I bought four of them.

I actually have more than 8TB of internal disk across all my computers, quite a bit more, but a fair chunk of that is backups, and backups of backups, and really bad anime that I'll never ever watch, and about a terabyte of Steam content which I can download again with one click (and two weeks of waiting).

So I'm running backups.

Running backups.
I have enough spare disks sitting around to replace all the drives in Nagi, mainly because I bought them with the intention of replacing all the drives in Nagi.* So once the backups are done that's probably what I'll do.

But first I'm going to get me a USB 3 card, because as things are, just restoring my C drive would take me more than 24 hours.

The drives themselves are quite small and neat, certainly smaller and neater and much less flaky than my previous Western Digital MyBook experience. That was a 500GB drive that I bought for $250 not all that long ago.** It worked, for a while, but then it would go into death sleep (rather like a ferret) from which the only way it could be awoken was to unplug and replug the power cord (rather like a ferret).

Western Digital MyBook Essential (left), ferret (right).
Not terribly convenient. It never actually lost any data or failed while it was actively in use, but it was annoying enough that I ended up just filing it away in a drawer.

So far my new MyBooks are working flawlessly. Which is good, because there's fundamentally only two ways a disk drive can work: Flawlessly and not at all.

* And that is because all the drives in Nagi are the infamous death-by-ring-buffer Seagate 7200.11. The gist of the story is this: The drives have a ring buffer in non-volatile memory to store the last 256 SMART alerts. But there's a bug such that if you power on while the pointer is on the last entry (255), rather than going back to zero, it increments to 256 and overwrites the drive firmware. In other words, if you have a drive that's not quite perfect - even running a little warm - then every time you turn your computer on there's a 0.4% chance that your drive will brick itself. As an added bonus, Seagate's first patch for the problem also bricked your drive. So far my drives have survived unpatched and unbricked.
** But several centuries in computer years.
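The ring-buffer failure described in the first footnote can be sketched in a few lines of Python. This is a toy model built only from the description above, not Seagate's firmware: the names BUFFER_SIZE, advance_buggy and advance_fixed are made up for illustration, and "bricked" here simply means the pointer has walked off the end of the buffer.

```python
BUFFER_SIZE = 256  # the log holds the last 256 SMART alerts (entries 0..255)


def advance_buggy(pointer):
    """Hypothetical buggy update: increments without wrapping around."""
    return pointer + 1


def advance_fixed(pointer):
    """What a ring buffer should do: wrap from the last entry back to zero."""
    return (pointer + 1) % BUFFER_SIZE


def bricks_on_power_up(pointer, advance):
    """True if the updated pointer lands outside the buffer."""
    return advance(pointer) >= BUFFER_SIZE


# Try every possible pointer position with both update rules.
fatal_buggy = [p for p in range(BUFFER_SIZE) if bricks_on_power_up(p, advance_buggy)]
fatal_fixed = [p for p in range(BUFFER_SIZE) if bricks_on_power_up(p, advance_fixed)]

# Chance of a fatal power-up if the pointer is equally likely to be anywhere.
brick_chance_percent = 100 * len(fatal_buggy) / BUFFER_SIZE
```

One fatal position out of 256 is where the roughly 0.4% figure comes from, assuming the pointer is equally likely to be sitting on any entry.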

Posted by: Pixy Misa at 07:02 PM | Comments (12) | Add Comment | Trackbacks (Suck)
Post contains 553 words, total size 4 kb.

1 Oh Great and Awesome Pixy, I am having troubles with rogue "more"s showing up on posts for no good reason that I can tell (see the current top post at The Pond for an example). Please to be helping?

Posted by: Wonderduck at Thursday, May 19 2011 10:20 AM (n0k6M)

2 I think I know what it is. What browser do you use? The editor acts differently depending on your browser, and sometimes it will stick a blank line in the "more" box all by itself.

Posted by: Pixy Misa at Thursday, May 19 2011 01:27 PM (PiXy!)

3 That hasn't happened to me yet. I'm using IE9.

Posted by: Steven Den Beste at Thursday, May 19 2011 02:36 PM (+rSRq)

4 I'm using Firefoxy 3.6.somethingorother. I've heard some not-so-great things about FF4, so I'm waiting to upgrade it.

Posted by: Wonderduck at Thursday, May 19 2011 02:56 PM (n0k6M)

5 Okay. (I'm using FF4 by the way - no problems except for the disappearance of the status bar.)

Posted by: Pixy Misa at Thursday, May 19 2011 05:23 PM (PiXy!)

6 Wonderduck, I think I've fixed it now. All the rogues are gone and shouldn't come back... Until the editor changes again.

Posted by: Pixy Misa at Saturday, May 21 2011 08:37 AM (PiXy!)

7 The problem with pasting is still there. Here's how to reproduce it:

Type some stuff. Then go elsewhere and copy a bit of text. Come back here, put the cursor at the end of what you typed, and paste the stuff you copied. It will appear at the beginning of the editing box, not at the end where the cursor is.

Posted by: Steven Den Beste at Saturday, May 21 2011 10:16 AM (+rSRq)

8 Ugh. For me it pastes at the end, but the browser then jumps to the top of the page.

Which I haven't noticed while writing posts, because the whole page fits on the screen. With comments, it's a mess.

I'll try it with IE9 and see what I get.

Posted by: Pixy Misa at Saturday, May 21 2011 03:38 PM (PiXy!)

9 Try doing a paste in the middle.

Posted by: Steven Den Beste at Sunday, May 22 2011 01:08 AM (+rSRq)

10 Okay, I get different bugs than you, but it is pretty terrible. Worked perfectly in the old version though.

Let me do some more testing and see what I can work out.

Posted by: Pixy Misa at Sunday, May 22 2011 09:11 AM (PiXy!)


Improvements

USB 3.0 is full-duplex, catching up with serial ports of, oh, 1970 or thereabouts.

I was curious as to how close it is to PCIe 2.0. The low-level encoding is the same (8b/10b) as is the raw speed (5Gb/s). Beyond that there doesn't seem much detail floating around unless you download the entire specification.

So I did.

I was wondering whether the weird connectors on my new external drives* were standard or some proprietary Western Digital nonsense. Turns out they're standard micro-USB-3 connectors. Which is good and bad; good because they're standard; bad because the standard is horrible.

Turns out USB 3, to maintain backwards compatibility with old-and-busted USB, includes old-and-busted USB.

Bearckwards compatibility.... Sorry.
That is, the cable and plugs and sockets and controllers all provide the two differential pairs for USB 3 transmit and receive (four wires total) plus the original USB 1/2 bus (two wires) plus power and ground (two wires). Which makes the connectors twice the size (except for the standard A-type connector (the flat one) which sneakily hides the four new contacts) and the cables twice as thick.

I can understand the need to switch from a turnaround bus to a proper full-duplex point-to-point connection. I'm surprised USB 2 even works as well as it does, having to turn around the connection constantly at 480Mb/s.

USB turnaround.
But this sort of kitchen-sink compatibility never turns out well.

On the other hand... 5Gb/s.
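To put that 5Gb/s in more familiar units, here is a rough back-of-the-envelope sketch. It only accounts for the 8b/10b line coding mentioned above (8 payload bits carried in every 10 bits on the wire) and ignores all protocol overhead, so real drives will see less. The function name payload_bytes_per_second is made up for this example.

```python
def payload_bytes_per_second(line_rate_bits, data_bits=8, coded_bits=10):
    """Payload bytes per second on a link using data_bits/coded_bits line coding."""
    payload_bits = line_rate_bits * data_bits // coded_bits
    return payload_bits // 8


USB3_LINE_RATE = 5_000_000_000  # 5Gb/s raw signalling rate
USB2_LINE_RATE = 480_000_000    # 480Mb/s raw signalling rate

# USB 3: 8b/10b coding leaves 4Gb/s of payload bits.
usb3_bytes = payload_bytes_per_second(USB3_LINE_RATE)

# USB 2: just the raw bit rate divided by 8, as an upper bound.
usb2_bytes = USB2_LINE_RATE // 8

speedup = usb3_bytes / usb2_bytes
```

So the ceiling moves from 60MB/s to 500MB/s, a bit over eight times, before either bus loses anything to protocol overhead.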

* Another story.

Posted by: Pixy Misa at 01:43 PM | Comments (1) | Add Comment | Trackbacks (Suck)
Post contains 241 words, total size 2 kb.

Sunday, May 15

Ohhhhhh

It's a Javascript thing.

IE lets Javascript copy from a field to the clipboard. Other browsers don't.

That means that unless you're running IE, the cut/copy buttons won't show up in the editor.

The paste-from-Word button isn't showing in Firefox with the new editor either, and that is supposed to work. (Edit: Fixed!)

Also, that nasty habit of inserting a line break into the More field has returned, only now it's slightly different. (Edit: Fixed!)

Posted by: Pixy Misa at 02:22 PM | No Comments | Add Comment | Trackbacks (Suck)
Post contains 76 words, total size 1 kb.

Thursday, May 12

Sneaky Buggers, But In A Good Way

There are two kinds of flash memory: The expensive enterprise kind, called SLC, and the cheap crappy kind that everyone actually uses, called MLC.

The difference is that SLC stores one bit in each memory cell, while MLC stores two or even three. MLC does this trick by varying the voltage... Or is it charge... The something levels of the cell, so where an SLC cell is either on or off, an MLC cell has four or eight distinct levels of onness or offness.

The good thing about this (which is why everyone does it) is that you get two (or three) times as much storage in a given amount of silicon.

The bad thing about this is that it's two (or four) times less robust. Actually, in practice, MLC is twenty or thirty times less robust than SLC.

SLC vs. MLC. MLC (right) has already suffered data corruption.

All flash memory cells have a limit to how many times they can be erased and overwritten before they get clogged up with discarded electrons and stop working. With SLC, this is on the order of 100,000 times. With current MLC, it's on the order of 3,000.

This isn't a huge problem with file storage, because you tend to write a file to disk and leave it there. Every so often you'll delete a bunch of old files and create a bunch of new files, but the turnover isn't huge.

With database storage, things are completely different. Every time you update something in a database, the updates have to be written back to disk, along with any changes to the indexes. A single new record can easily trigger a dozen disk writes. A busy database can fry a standard MLC SSD.
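A rough way to see why the erase limit matters for databases is to estimate how long a drive lasts under a steady write load. The sketch below assumes perfect wear levelling (every cell gets erased equally often) and uses the 100,000 and 3,000 cycle figures from above; the 100GB drive and the 500GB per day of writes are invented numbers for illustration, and lifetime_days is a hypothetical helper, not anyone's real tool.

```python
SLC_CYCLES = 100_000  # erase cycles per cell, order of magnitude for SLC
MLC_CYCLES = 3_000    # erase cycles per cell, order of magnitude for current MLC


def lifetime_days(capacity_gb, erase_cycles, writes_gb_per_day, write_amplification=1.0):
    """Days until every cell has used up its erase cycles, assuming ideal wear levelling."""
    total_writable_gb = capacity_gb * erase_cycles
    return total_writable_gb / (writes_gb_per_day * write_amplification)


# Hypothetical 100GB drive behind a database writing 500GB per day.
mlc_days = lifetime_days(100, MLC_CYCLES, 500)
slc_days = lifetime_days(100, SLC_CYCLES, 500)
```

With those made-up inputs the MLC drive is worn out in under two years and the SLC drive outlives the server by decades. Raising write_amplification above 1.0 (for index updates, logs and so on) shortens both in proportion.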

But, because MLC drives are cheaper, everyone buys those, and economies of scale kick in and MLC gets even cheaper. SLC drives now run around $12 per gigabyte, and MLC less than $2, even though SLC only really costs twice as much to produce.

How do you resolve this problem? Particularly when you want to move your entire server to SSD, but might only need 10% of the disk to be enterprise-database quality SSD?

Well, if you're Toshiba and Sandisk, what you do is make the flash memory block-configurable between MLC and SLC. Well, pseudo-SLC. Rather than writing only 1 or 0, for specifically selected blocks you can only write 11 or 00. If that cell is flaking out and shifts down to the 10 level or up to the 01 level, not only can you detect and correct that at read time, you can mark that block as bad and allocate a spare block in its place. So you have improved margins and better error detection.
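The pseudo-SLC read logic can be sketched as follows. This is an illustration of the idea as described above, not Toshiba's or Sandisk's actual controller logic: cell levels are modelled as the integers 0 to 3 (binary 00, 01, 10, 11), and the function names are made up.

```python
def read_pseudo_slc(level):
    """Decode one 2-bit cell used in pseudo-SLC mode.

    Returns (bit, drifted): the corrected bit, and whether the cell was
    found at an intermediate level (01 or 10) that is never written.
    """
    if level not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("a 2-bit cell has levels 0..3")
    bit = 1 if level >= 0b10 else 0       # 10 is read as a sagging 11, 01 as a creeping 00
    drifted = level in (0b01, 0b10)       # only 00 and 11 are ever written
    return bit, drifted


def read_block(levels):
    """Decode a block; flag it for replacement if any cell has drifted."""
    results = [read_pseudo_slc(level) for level in levels]
    bits = [bit for bit, _ in results]
    mark_bad = any(drifted for _, drifted in results)
    return bits, mark_bad


# A block written as 1, 0, 1, 0 where the third cell has slipped from 11 to 10.
bits, mark_bad = read_block([0b11, 0b00, 0b10, 0b00])
```

The data still reads back correctly, and the drifted cell gives the controller an early warning that plain SLC, with only two levels, could not provide.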

Toshiba representatives demonstrate their new block-configurable flash devices.

Micron and Intel have announced something called eMLC - enterprise MLC - but have been short on details so far. I'll be surprised if it's not something very much like this. I'm hoping also that it will be twice the price of regular MLC, rather than six times.

Toshiba's block-configurable trick is even better, but there's not even the ghost of a standard of how to configure different parts of a single storage device to provide a different density/reliability tradeoff, so it will be a while before that idea hits the general storage arena.

Pictures from A Channel and Ano Hana via RandomC.

Posted by: Pixy Misa at 11:27 PM | Comments (2) | Add Comment | Trackbacks (Suck)
Post contains 576 words, total size 4 kb.

Self-Similar Loads And The Deaths Of Cloud Computing

The recent collapse of not one, but multiple entire Amazon Availability Blooples* into a smoking crater caused a certain amount of buzz in the webosphere. It would have caused more of a buzz if it hadn't reduced a fair chunk of the webosphere to a smoking crater along with it.

What happened?

Well, someone at Amazon threw the wrong switch during a network upgrade. Effectively, instead of rerouting traffic onto a carefully planned detour, they rerouted traffic onto the sidewalk.

This did not go over terribly well with all the servers trying to send data to their storage pigs* further along the sidewalk. Since there's a significant variability in performance of Amazon storage pigs* many servers were set up to take any slowdown as an indication of a bad pig* and automatically try to set up a new pig* to replace it. To do this, the data had to be replicated....

Along the sidewalk, which was already jammed beyond capacity.

To say that the problem snowballed at this point would be to waste a perfectly good video involving mousetraps and ping-pong balls.

You see, the idea of setting up a huge hosting cloud thingy like Amazon has done is that most servers run mostly idle most of the time. (Ours, for example, has 12 cores and uses, on average, slightly less than one.)

So if you aggregate a whole lot of servers together into one huge bloople* you can get far more sites running on far less hardware and make a huge amount of money in the process. Until someone drops a ping-pong ball; once that happens there's no way to stop the process. It's far too big and complicated to control manually. The entire bloople* is set to burn down, fall over, and sink into the swamp and all you can do is watch.

All you can do is watch...

Because traffic (and hence load) doesn't neatly average out when you aggregate lots of different services together. Instead, it piles up. Internet activity levels are self-similar - everything everywhere tending to follow the same pattern of spikes and dips at the same time.

When one service spikes, it's likely that everything else is spiking at exactly the same moment. And since cloud computing gains efficiency by eliminating the huge amount of headroom you would traditionally plan into a dedicated server (or server farm, depending on how many shoestrings you have to throw around), this leads to everyone looking for extra capacity at the same moment. And that puts more strain on everything right when it's at its busiest, and....

Splat.*
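The claim that aggregated load doesn't average out can be illustrated with a small simulation. To be clear about what this is: a toy model of correlated spikes, not a proper self-similar traffic generator. Each service's load is a random number, and in the correlated case every service is also multiplied by one shared "the internet is busy right now" factor. All names and parameters are invented for the example.

```python
import random


def peak_to_mean(series):
    """How far the worst moment sits above the average."""
    return max(series) / (sum(series) / len(series))


def simulate(num_services=100, num_steps=1000, correlated=False, seed=1):
    """Peak-to-mean ratio of the total load across all services."""
    rng = random.Random(seed)
    totals = []
    for _ in range(num_steps):
        # Shared factor: everyone spikes together when correlated.
        shared = rng.expovariate(1.0) if correlated else 1.0
        total = sum(shared * rng.expovariate(1.0) for _ in range(num_services))
        totals.append(total)
    return peak_to_mean(totals)


independent_ratio = simulate(correlated=False)
correlated_ratio = simulate(correlated=True)
```

With independent services, the total stays close to its average and a modest amount of headroom covers the peaks. With a shared factor, the peak of the total is several times the average no matter how many services you pile together, which is the headroom the cloud model was counting on not needing.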

In Amazon's case, the splat* was triggered by someone dropping a ping-pong ball. But that's just the proximate cause. People drop ping-pong balls every day. It's only a drama if you happen to have covered every level surface of your home including the ceiling with fully-armed spring-loaded ping-pong ball launchers.

But that's what every cloud provider, almost without exception, has done. That's the entire business model. It is cheap, but it's intrinsically flaky.

Intrinsically flaky.

It's no accident either that the piece of the puzzle - uh, bloople* - that flaked out in this was the flakiest flake of all, the network-attached storage. Amazon's EBS gives you disks attached across a network.

Disks suck. There's no gentler way to put it. At my day job, we have SSDs all over the place, because we'd be dead without them. (We know, because we tried that at the start. We died. Then we went out and bought a bunch of SSDs and tried again.) Disk access is on the order of ten million times slower than CPUs, and modern servers typically have more CPUs than disks.

Even so, when your disks are right there in your server, at least you can see how busy they are (too busy) and who's using them (you). When the disks are abstracted away to free-roaming data pigs*, all you have is an end result. Pig* too slow? Don't try to investigate the problem. You can't investigate the problem; it's been abstracted to such a degree that there's simply no information available. People tried mounting new pigs* because that was the only thing they could do. They were throwing gasoline onto a bonfire, but when you build a bonfire and hand everyone a free can of gasoline, you really shouldn't be surprised at the result.

So, how do we fix this?

Well, first, everyone everywhere who has anything to do with anything at all should be nailed to the floor and forced to read J. B. S. Haldane's On Being the Right Size.

Second, anyone planning to deploy a new server with disks used for anything other than backups and log files should be lightly shot.

Third, watch Ano Hana.

* The technical term.

Pictures from A Channel and Ano Hana via RandomC.

Posted by: Pixy Misa at 10:11 PM | Comments (3) | Add Comment | Trackbacks (Suck)
Post contains 815 words, total size 6 kb.

1 The question is, who's going to pay the price of your suggestions. Actually what's interesting, the costs of most cloud providers are not far removed from dedicated. And for the storage, dedicated is better (same price at Rackspace Cloud buys 10GB (which is what yukiho.zaitcev.us is), but 400GB at Pacific Rack (where mitsuki.animeblogger.net used to be)). Your operation may be just big enough to have a few servers fully utilized, which puts you in the sweet spot. Anyone who's too small or too big has to go cloud. I pay $11 for yukiho, but $80 is the smallest box available otherwise (Dreamhost? Don't make me laugh. It's pure cloud too, only without cloud's convenience and programmatic access, and with mafia customer service).

Posted by: Pete Zaitcev at Friday, May 13 2011 02:23 AM (9KseV)

2 I was actually going to title this piece Self-Similar Loads And The Death And Death And Death Of Cloud Computing, but it was kind of long. Maybe Self-Similar Loads And The Deaths Of Cloud Computing. Yeah, that'll work.

The point isn't that cloud computing as a paradigm is bad, but that we're going to see repeats of this outage until everyone understands that (a) there are always diseconomies of scale to contend with, and (b) if you need robust shared storage, you can't use a big pool of network-attached disks. You just can't.

Joyent seem to have recognised these points; their blog can be both interesting and entertaining.

Amazon's AWS, on the other hand, is the classic beautiful implementation of a terrible idea.

Posted by: Pixy Misa at Friday, May 13 2011 02:55 AM (PiXy!)

3 Where was I?

Oh, yeah.

The brilliant thing about modern SSDs is that accesses are fungible. With a decent controller, it doesn't matter whether you're reading or writing 4KB blocks or 1MB blocks; you get similar performance. (Early commodity SSDs - just a few years ago - were mind-bogglingly awful at random writes, but that has since been resolved.)

With disks, accesses are most definitely not fungible. Reading random 4KB blocks you might get 0.5 MB/s off a typical low-end drive, and only twice that off a high-end server drive. Reading sequentially, you can easily get 100MB/s off a low-end disk. Only... Not if someone is trying to read random blocks off it at the same time.
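The random-read number falls out of simple arithmetic: throughput is operations per second times block size. The sketch below assumes a low-end drive manages about 125 random reads per second (roughly 8ms per seek plus rotation); that figure is my assumption for illustration, while the 4KB block size and the 100MB/s sequential rate are the ones quoted above.

```python
def random_read_bytes_per_second(reads_per_second, block_bytes):
    """Throughput when every read needs its own seek."""
    return reads_per_second * block_bytes


BLOCK_BYTES = 4 * 1024            # 4KB blocks
LOW_END_READS_PER_SECOND = 125    # assumed: about 8ms per random access
SEQUENTIAL_BYTES_PER_SECOND = 100 * 1000 * 1000  # 100MB/s sequential

random_rate = random_read_bytes_per_second(LOW_END_READS_PER_SECOND, BLOCK_BYTES)

# How many times faster the same disk is when read sequentially.
penalty = SEQUENTIAL_BYTES_PER_SECOND / random_rate
```

That works out to about 0.5MB/s for random reads, a penalty of roughly 200 times against sequential access on the same hardware.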

With the right design - like Apache's Cassandra database, for example - disk I/O can still be screamingly fast. But when you share disk, you're counting on every one of your customers having well-designed software. You might as well count on every one of your customers being an 18-year-old natural redhead with 36-24-36 curves, i.e., unless you live in a cartoon it ain't going to happen.

While SSDs are significantly more expensive, the fact that they don't give a damn about access patterns actually makes it cheaper to build robust storage systems.

Now, for you, this doesn't matter a whole lot. A small, smartly run provider is perfect for you, and on that scale they can actually have an idea of what's going on with the storage systems.

It's the people who are running multiple Quadruple Extra Large EC2 Instances with Cheese - at $1500+ per month plus bandwidth plus storage - who are getting burned, and will keep getting burned until the cloud providers change their approach.

Posted by: Pixy Misa at Friday, May 13 2011 03:11 AM (PiXy!)

