Baby don’t look up… The sky is falling…

September 12th, 2007

… Ok, so not really…  But if you’d checked in with me yesterday evening I might have felt differently…

Short story long:  So midday yesterday I’m at work when I get an e-mail from my friend Scott…

“Um, is there something up with your server, or your internet connection…”

I check, and sure enough, things are running REALLY slowly…  A little more digging by Scott and me reveals that one of my hard drives seems to be dying…

…Unfortunately, when I set the server up, I was greedy, and opted to set the two drives up in a RAID 0 array (which stripes the data across the drives, giving you the combined space of both) rather than a more redundant RAID 1, which mirrors one drive to the other, so if one fails you don’t, you know, lose everything…
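For the curious, here’s the trade-off as a rough Python sketch…  The function name is made up for illustration, and the 250 GB per-drive figure is an assumption on my part, so treat the numbers as an example rather than gospel…

```python
def raid_summary(level, drive_gb, drives=2):
    """Rough usable space and fault tolerance for a simple RAID 0 or RAID 1 array.

    Hypothetical helper for illustration only; assumes identical drives.
    """
    if level == 0:
        # Striping: data is split across every drive, so you get all the
        # space, but losing any one drive takes the whole array with it.
        return {"usable_gb": drive_gb * drives, "survives_one_failure": False}
    if level == 1:
        # Mirroring: every drive holds a full copy, so you only get one
        # drive's worth of space, but a single failure is survivable.
        return {"usable_gb": drive_gb, "survives_one_failure": True}
    raise ValueError("this sketch only covers RAID 0 and RAID 1")


# Assumed example: two 250 GB drives.
print(raid_summary(0, 250))  # {'usable_gb': 500, 'survives_one_failure': False}
print(raid_summary(1, 250))  # {'usable_gb': 250, 'survives_one_failure': True}
```

Twice the space, zero safety net…  that’s the deal I signed up for…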

So, I get home and try to pull some of the data off, but it’s very slow going…  I get a few files copied, but then everything grinds to a halt and the server hangs…  So I chance a reboot, but, sadly, the system barely gets started and the kernel panics…  not good…

I try a couple live CDs that I have lying around, but those can’t seem to see the RAID drive…

After a run and some much-needed (but probably ill-advised) comfort food (um, maybe the run and the double-whopper with cheese cancel each other out?), I tried the drive manufacturer’s disk diagnosis tools, but those were a bust as well.  Basically, all they gave me was “There are too many errors to report, you’re screwed”…

Sigh…

Finally, just before bed, I happened upon some forums discussing how to set up Ubuntu Linux on a RAID 0 array…  Basically, the live CD alone doesn’t have the necessary packages, but once you’ve booted up, you can install the “dmraid” package and use that to make the array available for mounting…

…Sooooo, I bang my head against the wall for about half an hour trying to figure out how to get that to work… the instructions are really simple, but for some reason my drive just isn’t showing up…  Finally I realize that the live CD I have is a couple versions old, and apparently I need something with a newer kernel…

Well, a quick (ok, not THAT quick) download later, I was able to boot into the latest Ubuntu live CD and install dmraid, and I’ll be danged if all the partitions didn’t show up…  Amazingly, I was then able to mount the ones I needed and kick off a full backup while I got some much-needed sleep…
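In case it helps anyone in the same boat, the steps above boil down to roughly the following…  This is a sketch, not a transcript of what I typed: the mapper name and mount point are placeholders (the real name under /dev/mapper depends on your RAID controller), and mounting read-only is my own cautious assumption for a dying drive…  It only prints the commands unless you turn off the dry run, and they’d need root to actually work…

```python
import shlex
import subprocess


def dmraid_recovery_commands(mapper_name, mount_point="/mnt/rescue"):
    """Return the recovery steps, in order, as argument lists.

    mapper_name and mount_point are hypothetical placeholders.
    """
    device = f"/dev/mapper/{mapper_name}"
    return [
        ["apt-get", "install", "-y", "dmraid"],      # not on the live CD by default
        ["dmraid", "-ay"],                           # activate all detected RAID sets
        ["mkdir", "-p", mount_point],
        ["mount", "-o", "ro", device, mount_point],  # read-only: an assumption, to be gentle on the disk
    ]


def run(commands, dry_run=True):
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run with a made-up mapper name: prints the commands without running them.
run(dmraid_recovery_commands("example_raid_set1"))
```

Once the partition is mounted, it’s just a matter of copying everything off to somewhere safe…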

I took a quick look at things this morning and it looks like everything I needed copied over just fine…  I’ll have to do a bit more checking this evening to be sure, but I think I managed to recover everything…

Anyway, for the time being I’m now more or less up and running on my backup server (which is why, if you checked in yesterday, you got the site as it was several months ago)…  I’m still pondering how best to get the real server back up…  I’m guessing that the failed drive will need to be RMAed.  Once it’s back, I’ll probably switch it to a RAID 1 array (250 GB should be more than enough, right?), and then maybe ponder an OS change (I’m thinking CentOS, to match the rest of the TechieNet family?)…

…I also think I need to look into a routine backup solution…  yeah… :)