Friday, July 03
And I've Got WTFF News
Akane, the new super-server that hosts everything mu.nu and mee.nu in over a dozen virtual machines, just died for no apparent reason.
Yeah, we're back, and only because I was already awake at 4:30AM after being up all night fixing other problems.
Had to do an index repair on the main posts table; other than that we seem to be okay.
Update: Guess what the cause of the crash was? Go on, guess! You'll never guess!
Oh, you guessed.
Update: It gets better. Apparently I need to back up all my data, rebuild the array, and restore everything again. Why the hell do I have a RAID controller in the first place?
Update: Okay, I think I've got it. The system locked up and crashed, apparently due to a bad disk. Now it has inconsistent data between the data drives and the parity drive for some part of the secondary volume. That may or may not be a serious problem, and there's no way to tell unless you run the fix process. The fix process may involve completely wiping and rebuilding the array.
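For anyone wondering why a parity mismatch is so hard to judge from the outside, here is a minimal sketch in Python. It assumes plain XOR parity across a stripe, which is how single-parity RAID levels generally work; the post does not say which RAID level or controller is involved, and the function names and sample bytes are made up for illustration.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized data blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def stripe_is_consistent(data_blocks, parity_block):
    """True when the stored parity matches the parity recomputed from the data."""
    return xor_parity(data_blocks) == parity_block

# Hypothetical two-byte blocks from three data drives.
data = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_parity(data)

# Case 1: a data block went bad, parity is still good.
bad_data = [b"\x01\x02", b"\x10\x21", b"\x0f\xf0"]

# Case 2: the data is fine, the parity block went bad.
bad_parity = bytes([parity[0], parity[1] ^ 0x01])

healthy = stripe_is_consistent(data, parity)
data_damaged = stripe_is_consistent(bad_data, parity)
parity_damaged = stripe_is_consistent(data, bad_parity)
```

Both damaged cases fail the check in exactly the same way, so the check alone cannot say whether the real data or only the parity is wrong. That is the sense in which it may or may not be a serious problem.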
So no matter what, I need a complete copy of everything on another server. Which is not a bad thing to have anyway, but it is sixteen million files.
Update: 700,000 down, 15.3 million to go. I have to back up sixteen complete virtual machines, plus the backup directory, plus the mee.nu filestore.
Update: 2,700,000 done. There are 5.7 million files across the virtual machines, and another 9.7 million in the backup volume. But the backup volume doesn't *have* to be backed up, because it's a backup.
So, halfway.
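The arithmetic behind "halfway", as a quick Python sketch. It assumes the 2.7 million files copied so far all belong to the virtual machines, and it uses only the rounded counts quoted above; the variable names are just for illustration.

```python
vm_files = 5_700_000             # files across the virtual machines
backup_volume_files = 9_700_000  # files in the backup volume (optional to copy)
copied = 2_700_000               # files copied so far

# Progress if only the virtual machines need to be copied.
fraction_of_vms = copied / vm_files

# Progress if the backup volume had to be copied as well.
fraction_of_everything = copied / (vm_files + backup_volume_files)

print(f"{fraction_of_vms:.0%} of the VM files")
print(f"{fraction_of_everything:.0%} of everything")
```

Skipping the backup volume is what turns this from under a fifth done into roughly half done.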
I'll schedule the RAID repair for this weekend, so that if it does end up taking us out, I have the maximum time to fix things.
Posted by: Pixy Misa at 05:10 AM