How to rescue a broken disk drive using ddrescue.
After I plugged in an old external Windows NTFS disk drive, it was clear that the drive had a problem: it took forever to show up in Explorer, and all attempts to copy files or open folders failed.
So this was a good opportunity to try out a Linux disk rescue tool: ddrescue. This tool copies data blocks from the damaged disk to a good one, but unlike other sector copy tools such as dd, it does not stop when it encounters a bad block. Instead, it skips forward a few hundred blocks and attempts to read and copy more data. When it has reached the end of the disk, it retries the sections with bad blocks, this time reading block by block to retrieve as much data as possible.
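The skip-then-retry idea can be illustrated with a small toy model in Python. This is only a sketch of the strategy as described above, not ddrescue's actual algorithm, which is considerably more elaborate. The function name `rescue`, the `skip` parameter and the list-based "disk" are all hypothetical, chosen for illustration.

```python
def rescue(disk, skip=4):
    """Toy two-phase rescue. `disk` is a list; None marks an unreadable block.

    Returns (image, bad): the recovered blocks and the indices that
    stayed unreadable after the second phase.
    """
    n = len(disk)
    image = [None] * n
    skipped = []  # indices jumped over during the fast first phase
    pos = 0

    # Phase 1: copy forward; on a read error, jump ahead by `skip` blocks.
    while pos < n:
        if disk[pos] is not None:
            image[pos] = disk[pos]
            pos += 1
        else:
            end = min(pos + skip, n)
            skipped.extend(range(pos, end))
            pos = end

    # Phase 2: revisit the skipped regions one block at a time.
    bad = []
    for i in skipped:
        if disk[i] is not None:
            image[i] = disk[i]
        else:
            bad.append(i)
    return image, bad


# Made-up example: ten blocks, two of them unreadable.
disk = ["a", "b", None, "c", "d", None, "e", "f", "g", "h"]
image, bad = rescue(disk, skip=3)
print(bad)  # indices of blocks that could not be read in either phase
```

The first phase trades completeness for speed: it leaves the damaged neighbourhood quickly, and the slow block-by-block reads happen only after everything that is easy to read has already been copied.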
Often, when a bad block is accessed, the disk drive hangs for several seconds. Since my disk drive had so many bad blocks, it took ddrescue about three weeks to complete one pass. During the second pass the disk drive failed for good, and no more data could be recovered.
Here is the command line and status output from ddrescue:
The options mean: direct disk access for the input, bypassing the kernel cache (-d); force overwriting the output device instead of writing to a regular file (-f); verbose output (-v); read backwards from the end of the disk to the start (-R).
The broken disk is the WD20EADS at /dev/sdd, and I want to copy everything on it to a new disk, an HDS7230BLA642, at /dev/sde. The status section shows more than 250,000 read errors, that only 1747 GByte of a total of 2000 GByte (about 87 %) could be read, and that not a single block could be read from the disk within the last hour. So it is practically dead.
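For reference, the invocation can be pieced together from the options and device names given in the text. The following Python sketch only assembles and prints the command line; it does not run ddrescue. The argument order (input, output, then the status file) follows ddrescue's usual synopsis, and using wd.logfile as the third argument is an assumption based on the file name mentioned in this article.

```python
import shlex

# Values taken from the article's description; the exact original command
# line is an assumption reconstructed from them.
source = "/dev/sdd"     # failing WD20EADS
target = "/dev/sde"     # new HDS7230BLA642
mapfile = "wd.logfile"  # file in which ddrescue records the rescue status

cmd = ["ddrescue", "-d", "-f", "-v", "-R", source, target, mapfile]
command_line = shlex.join(cmd)
print(command_line)
```

Double-check the device names before running anything like this: with -f, ddrescue overwrites the output device without further questions.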
The rescue status is stored in the file wd.logfile. This file can be displayed visually with ddrescueview.
In the ddrescueview display, green dots represent successful reads and yellow dots represent sections with read errors. Normally, during the following passes, ddrescue would attempt to re-read the yellow sections and try to find some more readable blocks, but in my case the disk had become unresponsive, so I aborted ddrescue after four or five weeks of run time.
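The same status file can also be summarized without a graphical tool. The sketch below assumes the usual ddrescue mapfile layout: comment lines start with `#`, the first data line holds the current position and status, and every following line is `pos size status`, with `+` marking finished blocks and `-` marking bad sectors. The function name `summarize_mapfile` and the sample data are made up for illustration; they are not taken from my actual wd.logfile.

```python
from collections import Counter

# Status characters used in ddrescue mapfiles.
STATUS_NAMES = {
    "?": "non-tried",
    "*": "non-trimmed",
    "/": "non-scraped",
    "-": "bad-sector",
    "+": "finished",
}


def summarize_mapfile(text):
    """Return the number of bytes per block status found in mapfile text."""
    totals = Counter()
    seen_status_line = False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not seen_status_line:
            # The first data line is the current position/status, not a block.
            seen_status_line = True
            continue
        pos, size, status = line.split()[:3]
        totals[STATUS_NAMES.get(status, status)] += int(size, 0)
    return dict(totals)


# Invented sample in mapfile layout, for demonstration only.
sample = """\
# Mapfile. Created by GNU ddrescue
# current_pos  current_status  current_pass
0x00030000     -               2
#      pos        size  status
0x00000000  0x00020000  +
0x00020000  0x00010000  -
0x00030000  0x00050000  +
"""

summary = summarize_mapfile(sample)
print(summary)
```

For a real rescue, one would read the file with `open("wd.logfile").read()` and pass the text to the function; the `finished` total then corresponds to the successfully read areas.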
All data that could be recovered is now stored on disk /dev/sde. Never run repair tools directly on this disk, because a failed repair attempt could destroy the only rescued copy; instead, copy its content to a third disk of at least equal size and work on that:
dd if=/dev/sde of=/dev/sdf bs=1M
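What dd does here is a plain block-wise copy with a block size of 1 MiB. As a rough illustration, the same loop looks like this in Python. This is a minimal sketch that I would only try on ordinary image files; the function name `clone` is hypothetical, and for real devices dd remains the appropriate tool.

```python
def clone(src_path, dst_path, block_size=1024 * 1024):
    """Copy src to dst in fixed-size blocks, similar to dd with bs=1M.

    Returns the number of bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(block_size)
            if not chunk:  # end of input reached
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

Calling `clone("rescued.img", "work.img")` on two image files (hypothetical names) would give a working copy to experiment on while the rescued image stays untouched.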
I then plugged that third disk into my Windows notebook. Of course, Windows complained bitterly about many file system errors, but after another day or two it had successfully finished the disk repair process. The disk showed up in Windows Explorer just fine.
I must admit that I did not expect at all that a file system with that many bad blocks could be repaired, but it worked. Of course, many larger files are corrupt because they overlap with one of the defective areas, but there are many smaller files that sit in the green areas and they are good.
Lessons learned
- Do make backups. Every disk will eventually fail, even if it sits in a locker and is not touched for years, as in my case (and yes, I have a backup of this particular disk, so it was not a loss regardless of the outcome of the rescue operation).
- ddrescue can take weeks or months to read large disks. So be sure you have a backup of your important data in some other place in case you need it urgently.
- If a disk has problems, reading from it will probably damage it even more.
- Large files are more likely to be hit by a defective area than small ones, because they span more blocks. It is better to store data in several smaller zip files instead of one large one.
- The NTFS file system is surprisingly robust.
- Make backups!