rickshangle.com

July 21, 2006

They always know

Filed under: Apple, Data Control — rshangle @ 5:32 pm

They being hard drives, the thing to know being when to fail in order to cause maximal TFDL[1].

I don’t mean to assign hard drives some unwarranted malevolent intelligence — it’s always a bad time.


Depending on how rigorous your backup process is, a lost drive is mostly a time-consumer for home users (I’m talking about human beings here, not companies; that’s another story): you replace the drive and recover the data.

If you’re not so on top of the data protection thing, then the impact can range from inconvenient (time lost reloading everything / recreating lost data that is re-creatable) to expensive (lost music from iTunes music store) to catastrophic (lost pictures from honeymoon that existed only on that drive; just a random example).

Given that I spend half of my professional life thinking about data protection (and the other half thinking about storing/managing data before it gets lost, then recovered; hooray!), it would be a cobbler’s-kid-foot-bare-thing if I didn’t have nominally effective practices in place for protecting critical data. Since my office is my home, this really applies to everything. 2x.

And trust me, I do have policies, and procedures. They’re not as robust as the ones I build and sell to companies, but I don’t have hundreds of thousands-to-millions of dollars to spend on data protection, either.

So I’ll tell you a secret, for free.

Every ten minutes, my PowerMac G5 looks to see if my PowerBook17 is on the network, after determining that the process I am now describing is not already happening.

If the G5, which is called trogdor, finds the laptop, which is called sm, then it connects to it and starts copying work-related files that have changed (since trogdor is really the main repository of all data, and sm only “checks stuff out” and brings it all back home).

Trogdor also has about 1.5TB of disk space on it, and sm only has 80GB. Which is another reason why this is this way.
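For the curious, the ten-minute job described above can be sketched in a few lines of Python. This is a minimal illustration, not the actual scripts: the lock file path, the `sm.local` hostname, and the source and destination paths are all made up for the example, and it assumes `ping` and `rsync` are available on the machine doing the pulling.

```python
import os
import subprocess

# All four values below are hypothetical placeholders for illustration.
LOCK_PATH = "/tmp/sm-sync.lock"
LAPTOP = "sm.local"
SOURCE = "rshangle@sm.local:work/"
DEST = "/Volumes/Backup/work/"


def acquire_lock(path):
    """Create the lock file atomically; return False if a run is already going."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True


def laptop_is_up(host):
    """Send one ping; a zero exit status means the host answered."""
    result = subprocess.run(["ping", "-c", "1", host],
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def rsync_command(source, dest):
    """rsync skips files whose size and mtime already match; -a preserves them."""
    return ["rsync", "-a", source, dest]


def sync_once(lock_path=LOCK_PATH, host=LAPTOP,
              is_up=laptop_is_up, run=subprocess.run):
    """One pass: bail if already running, bail if laptop is away, else copy."""
    if not acquire_lock(lock_path):
        return "busy"
    try:
        if not is_up(host):
            return "absent"
        run(rsync_command(SOURCE, DEST), check=False)
        return "synced"
    finally:
        os.remove(lock_path)
```

Something like cron or launchd would call `sync_once()` every ten minutes. One caveat with this sketch: if the process is killed outright, the lock file stays behind and has to be removed by hand.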

The system works pretty well. Mostly.

Which is why, when the hard disk on my PowerBook17 began failing in the middle of a data migration process (getting everything off it in preparation for wiping it and giving it to my wife, my new (and awesome) MacBook Pro having arrived the day before), I wasn’t particularly concerned about the work-related data on it. Or my writing, or receipts related to stuff I bought on the web, or a ton of other crap. It was all backed up to the mighty trogdor. On more than one physical drive, in some cases.

What I didn’t back up, though, because I am a f’ing idiot, are the honeymoon-related photos that have been sitting on the PowerBook, and only the PowerBook, for the last month or so.

Now, normally all digital photos in the house are imported through my G5 (trogdor), where there is a policy to (you guessed it) back the photo repository up to two separate hard drives automatically, daily.

The whole honeymoon photo thing… the photos were imported into the PowerBook (mid-honeymoon… mid-ocean, actually: we were on a cruise, as I’ve been very slowly describing in these pages; the PowerBook was with us, and the G5 obviously wasn’t)… the PowerBook came home, my wife began immediately turning the imported photos into books within iPhoto, yada… it just sort of… getting them into the G5, and therefore into the backup policy, just didn’t happen.

So now, since there is a set of data of immeasurable value on this failing disk, I am faced with doing something I would normally never do: roll up my sleeves and become a sort of computer forensics-ist in order to recover this, urp - the pain, priceless data.

Some people would be thankful that their work (work-work and non-work-work, i.e. hobbies) stuff is safe and sound. I am not one of those people. I am an animal — a data animal.

What would I do normally, you ask, were all the data confirmed safely backed up?

1. Computer to Apple store
2. “AppleCare. Hard drive. Replace. Ball peen hammer.”
3. fin

Why didn’t I, say, at least post the honeymoon pictures to a web site or something? I messed up. Simple as that.

Moving on / taking action

Ok, on to coaxing data off a dead/dying hard drive that wants to take the data with it to Davy Jones’ (Casey Jones’? Who was in the Monkees?) locker. Things of note:

1. The drive, which is a Toshiba 80GB laptop drive, is not mountable in the Finder. It has a journaled HFS+ partition. That’s it.
2. The drive won’t pass a fsck_hfs (broken sibling link), which means that it’s not going to pass Disk Utility, which is basically a shell into fsck.
3. DiskWarrior doesn’t see it. TechTool Deluxe doesn’t see it.
4. The Finder does indicate, when I put the laptop into Target Disk Mode and plug it into my G5, that an unrecognizable disk is now on the system, and asks what it should do. I tell it to ignore it.
5. The drive is not making any grinding / cackling / drooling sounds inside the case; in fact, that whole side of the laptop (the left side) is rather cool, which leads me to believe (at least part of the time) it’s not even spinning up.
6. S.M.A.R.T. (drive auto-diagnostics) on the drive indicates a status of “failing”. No sh*t. That technology sells itself.

Point 4 above qualifies as a very, very faint heartbeat on our patient, so I’m willing to take a (benign) whack at recovery before I send the laptop off to DriveSavers or the like to do their expensive magic.

Since all the jelly-coated Apple tools (provided and 3rd party) that I’m aware of either ignore this drive or can’t do anything with it in its current state, I need to see if there’s any way I can get the data off, on to a more stable (i.e. not-failing) media, for further analysis.

What will I do with the data then, since it looks like at a minimum that the directory on the drive has been cooked? I really don’t know, but we’ll worry about that later. Now, I want to get the data off the drive. I just want it off. It could turn out that the current state of accessible 1s and 0s on the drive is worth a pile of steaming excrement. I just want it off. The warp core is breaching; all the crew on the cruiser may already be dead, but we’re beaming them off. Just in case. I can’t believe I just used a Star Trek analogy; you can see what the stress of this situation is doing to me.

fsck says chunks of the filesystem directory are toast. This means tools. Real tools. Man-tools: block-level tools. Thank dog OS X is UNIX.

Some people have been saying good things about GNU ddrescue, which (from what I can tell) is basically the UNIX dd command combined with some sort of retry-and-log system: when the program (inevitably) experiences failures while copying from a failing drive, the log can be used to determine the point of failure and pick up at that point after the drive has been reset / given smelling salts. A checkpoint, if you will.

Whatever. I’m not one to look too deeply into things before charging ahead. I download and compile ddrescue.

>trogdor-5:~/Desktop/ddrescue-1.2 rshangle$ sudo ./ddrescue -v /dev/disk6s3 "/Volumes/Backup0000-0000 1/MyVolImage.dmg" "/Volumes/Backup0000-0000 1/MyVolRescue.log"
>Password:
>
>
>About to copy an undefined number of Bytes from /dev/disk6s3 to /Volumes/Backup0000-0000 1/MyVolImage.dmg
> Starting positions: infile = 0 B, outfile = 0 B
> Copy block size: 128 hard blocks
>Hard block size: 512 bytes
>Max_retries: 0 Split: yes Truncate: no
>
>Press Ctrl-C to interrupt
>Initial status (read from logfile)
>rescued: 7222 MB, errsize: 914 kB, errors: 5
>Current status
>rescued: 19471 MB, errsize: 914 kB, current rate: 3080 kB/s
> ipos: 19472 MB, errors: 5, average rate: 3420 kB/s
> opos: 19472 MB
>Copying data…

Ok, I know that looks… marginally all right and sort of depressing at the same time. It’s copying the data directly from the drive at the block level (bypassing the filesystem) and dumping what it finds into a file called MyVolImage.dmg.

What you don’t see is that periodically the drive just hangs/locks (as failing drives are wont to do), which means I need to control-c on ddrescue, shut down the laptop (which is in target disk mode), wait a bit, turn it back on (with T held down, for target disk mode), re-attach it to my G5 (where this is all running), and then restart ddrescue. Well, restart is the wrong word — continue. Remember that log I spoke about.
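The stop-and-continue idea is easy to picture with a toy version in Python. To be clear, this is a sketch of the checkpoint concept only, not how GNU ddrescue is implemented, and its JSON log has nothing to do with ddrescue's real logfile format. The `read_block` callable and the function names are hypothetical stand-ins for "read from the dying drive".

```python
import json
import os

# 128 hard blocks of 512 bytes, the copy block size ddrescue reports in its output
BLOCK = 512 * 128


def load_log(log_path):
    """Return the saved state, or a fresh one if no log exists yet."""
    if os.path.exists(log_path):
        with open(log_path) as f:
            return json.load(f)
    return {"pos": 0, "bad": []}


def save_log(log_path, state):
    """Write the current position and the list of unreadable ranges."""
    with open(log_path, "w") as f:
        json.dump(state, f)


def rescue(read_block, total_size, out_path, log_path, block=BLOCK):
    """Copy total_size bytes via read_block(offset, length) into out_path.

    read_block is a hypothetical callable that returns bytes or raises
    OSError for an unreadable region. Unreadable blocks are skipped and
    recorded; progress is checkpointed so a later call picks up where
    the previous one stopped.
    """
    state = load_log(log_path)
    # Reopen an existing image without truncating what was already rescued.
    mode = "r+b" if os.path.exists(out_path) else "w+b"
    with open(out_path, mode) as out:
        while state["pos"] < total_size:
            pos = state["pos"]
            length = min(block, total_size - pos)
            try:
                data = read_block(pos, length)
                out.seek(pos)
                out.write(data)
            except OSError:
                # Leave a hole in the image and note the range for later.
                state["bad"].append([pos, length])
            state["pos"] = pos + length
            save_log(log_path, state)  # checkpoint after every block
    return state
```

If the run is interrupted (Ctrl-C, a hung drive, a power cycle), calling `rescue` again with the same image and log paths starts from the last saved position rather than from byte zero.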

So far we’ve hit about 5 (as you can see) non-recoverable drive errors in 20 or so GB. It’s a pain in the ass, but if I have to experience 4x as many on the path to getting these pictures (if nothing else) off the drive, it will be worth it.

There we are for the moment - we are in scanning mode. ddrescue is churning away. Stay tuned.

and, if this works out, i’ll post all the honeymoon picts to the site. promise.

[1] total f****ng data loss
[2] and, in parallel once that’s done, I want to take the powerbook to the apple store and have a genius replace the drive so I can get to the business of getting the machine operational again

2 Responses to “They always know”

  1. ken Says:

    Ouch,

    Good luck. I’ll be interested in hearing how things turn out. I have yet to have to recover a dead drive, but I have a friend who is in need. They just moved from MD to SC, and his PC died and lost all of the pictures they had taken during their last six months.

    I haven’t looked (yet), but if you haven’t shared your backup procedures and/or scripts that would be a post I’d be interested in as I’m working on getting my backup environment set up.

    Thanks and good luck,

    –ken

  2. rshangle Says:

    Thanks, man. I’ll post the scripts soon. They’re pretty low-tech (rsync).

    ddrescue has finished, for what it’s worth. Next step - buying a new hard disk to be the data transplant recipient, so (hopefully) programs like DiskWarrior and/or Norton can make something of the scrambled data I just (mostly) copied off the failing drive. r
