Knowledge

In-Depth Discussion of Different Data Loss Scenarios

Data Recoverability

In a data loss scenario, the most crucial question is: Are the files still recoverable? The answer determines what needs to be done next: whether to pursue data recovery or to develop strategies for coping with the data loss.

The situation is often challenging to assess. Sometimes it is not entirely clear what caused the data loss in the first place. A technician might have already worked on the problem, muddling things further. The effect of common remedies, such as Microsoft's "Checkdisk", on the recoverability is often unknown.

This article attempts to sort out what is possible and what is not. We will try to explain, for a given scenario, what you can expect. We will limit the idea of recoverability to commercially available and affordable data recovery. While the magnetization that once constituted the data might still be present on the media, the technology to recover this data for an economically reasonable price might not.

The Success Rate Depends on the Type of Data

We have to consider the kind of data that needs to be recovered. Assume you can recover a hypothetical 90% of all lost files. If these files were photos, you could consider this percentage a success; you got 9 out of 10 photos back. If your files were database tables and 10% are missing, the entire database will probably be worthless because the tables rely on each other. The more interdependent the data, the greater the damage if even a small percentage of it is missing. We will also look at what statements such as "90% recovered" really mean. Another interesting aspect is the "time dimension": a data recovery outcome is usually worth less with each passing day or hour.

Physical and Logical Data Recovery

We want to distinguish between two very different procedures:

  • Physical data recovery: Extracting the raw data from the affected media

  • Logical data recovery: Reconstructing the files 

You can have pure logical data losses. For example, file deletion, drive formatting, or virus attacks require logical reconstruction only. On the other hand, a mechanically failed drive that is successfully repaired will not need logical reconstruction.

In reality, many physical problems will subsequently need logical reconstruction because not all data could be retrieved.

Dead Hard Drive

A drive can be considered "dead" if it is not accessible by any software means, e.g., the BIOS, Windows' Disk Management, or disk utilities such as Runtime Software's GetDataBack. A dead drive often shows additional symptoms: it does not spin, it "clicks", or it makes other kinds of strange noises.

These drives might have a damaged electronic board, damaged read heads, a damaged motor or damaged magnetic media. Data recovery companies with cleanroom facilities can often resurrect the drive by exchanging the broken parts. They will then image the drive and perform a logical file reconstruction.

This approach is sometimes successful and then well worth the cost of several hundred or even thousands of dollars; however, often enough, it is not successful.

Physical Recovery is Not Always Possible

First of all, success depends on the extent of the damage. It is not possible, even in theory, to recover data from a platter that was heated above its "Curie temperature" (770°C for iron). At this temperature, the platters are completely demagnetized. It is also doubtful whether anybody can recover data from a drive that fell on a hard floor. If the platters became unbalanced due to bending or impact, they will vibrate while spinning. If the vertical amplitude of this vibration is larger than the distance at which the read head flies above the surface (well below 1µm), the drive will sustain a permanent head crash, making it impossible to read the magnetic information and further destroying the surface. Horizontal vibration will make it impossible for the head to stay on the track, which is thinner than 1µm.

Tire shops apply weights to a wheel to balance it, but we know of no comparable technique for unbalanced platters.

The only technology possibly capable of overcoming this problem is Magnetic Force Microscope (MFM) imaging, since this technique does not require the platter to spin. However, MFM requires scanning the whole surface of the platter. The microscope moves from region to region, each region yielding a picture. This process alone would take several months. Then all these pictures must be stitched together. A 20 GB hard drive holds 160,000,000,000 bits, probably 300,000,000,000 bits including overhead. Each bit is represented by a magnetic flux change. A picture displaying this flux change will probably use 100 bytes, inflating each bit by a factor of roughly 1000. You would have to analyze about 40 terabytes of data. It is unknown whether this technology is in use. It certainly is not "commercially available and affordable", because such a data recovery would cost hundreds of thousands of dollars.
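The size estimate can be retraced with a few lines of arithmetic. The sketch below uses only the rough figures from the paragraph above (300 billion bits including overhead, an inflation factor of about 1000); the names are illustrative.

```python
# Back-of-the-envelope estimate of the MFM image data for a 20 GB drive.
# Both figures are the rough assumptions stated in the text.
RAW_BITS_WITH_OVERHEAD = 300_000_000_000  # about 300 billion flux changes
INFLATION_FACTOR = 1000                   # about 100 bytes of picture per bit


def mfm_image_terabytes(bits=RAW_BITS_WITH_OVERHEAD, factor=INFLATION_FACTOR):
    """Return the approximate amount of image data in terabytes (10**12 bytes)."""
    image_bits = bits * factor
    return image_bits / 8 / 10**12


print(f"{mfm_image_terabytes():.1f} TB")
```

The result lands slightly below the 40 terabytes quoted above, which is within the precision of such an estimate.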

Success also depends on the drive type. Many data recovery companies can "do" specific drives but not others. Modern drives are calibrated after assembly to work perfectly with the parts built into them (heads, platters, etc.). It is often impossible to use parts from another drive, even if both drives share the same model number.

There are no "magic" machines that are capable of recovering the data from any kind of drive. If the raw data can be retrieved, a subsequent logical reconstruction of the files must be performed.

Drives With "Bad Sectors"

These drives are still recognized by the BIOS or software such as GetDataBack, but they have read errors in one or more spots. After obtaining a drive image, you will need to reconstruct the files from this image with GetDataBack.

Create an Image

You can create an image with disk utilities, such as Runtime's DiskExplorer or GetDataBack. You should not try to make the image if the drive makes unusual noises, as the process of creating the image can further damage the media. In that case, you might rather hand the drive to a data recovery service company.

Of course, the question is what data recovery service companies will do other than trying to create an image. You should ask them. In the end, it is your own decision whether you want to try it yourself.

Before you begin, you should be well prepared. You should have the imaging software installed on a working computer and know how to use it. You should have sufficient hard drive space to hold the image of the bad drive. You should be focused on this task and observe its progress. You should not do anything else on this computer at the same time, such as playing games or surfing the internet. It is difficult to predict how long it will actually take to create the image. It depends primarily on the number of bad sectors on the drive and can take anywhere from 30 minutes to several days.
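To illustrate why bad sectors dominate the imaging time and what ends up in the image, here is a minimal sketch of sector-by-sector imaging. It is not how DiskExplorer or GetDataBack work internally; the function name and the zero-fill policy are assumptions made for this example.

```python
SECTOR_SIZE = 512


def image_drive(src, dst, total_sectors, sector_size=SECTOR_SIZE):
    """Copy a drive sector by sector into an image.

    src and dst are binary file objects. A sector that cannot be read is
    replaced by zeros in the image, and its number is recorded.
    """
    bad_sectors = []
    for number in range(total_sectors):
        try:
            src.seek(number * sector_size)
            data = src.read(sector_size)
        except OSError:
            # Read error: remember the sector and continue with the next one.
            data = b""
            bad_sectors.append(number)
        # Pad short or failed reads so every sector keeps its position.
        dst.write(data.ljust(sector_size, b"\0"))
    return bad_sectors
```

Each failed read typically costs the drive several retries, which is why a drive with many bad sectors takes so much longer to image. Files that were stored in the zero-filled sectors will be damaged in the recovered result.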

Attach External Drives Directly to Your Motherboard's SATA Port

If you have an external drive, for example, a USB drive, you should remove it from its case and attach it to the computer's SATA cable as an additional hard drive. Once you have obtained an image, you can run GetDataBack and recover the files.

Logical Reconstruction of a FAT-Formatted Drive With GetDataBack

DiskExplorer for FAT: Directory entry for file IMG_2379.JPG

GetDataBack scans your drive or an image and tries to reconstruct the original state of all files in the file system. GetDataBack can do this even if some file system structures, such as the partition table or boot records, are missing.

Let's examine in detail how GetDataBack recovers a file:

A file in a FAT file system is completely described by

  • its directory entry *),

  • its entry in the File Allocation Table*) (FAT), 

  • the allocated clusters*) containing the content of the file. 

*) The colors above correspond to the region colors in the graphic below.

File in FAT: FAT chain, directory entry, and allocation

The directory entry is picked up during the drive's initial scan when GetDataBack examines each sector. It contains the file's name, size, date, time, and the first cluster of its data.
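As an illustration of what such a directory entry holds, the following sketch decodes the standard 32-byte FAT short-name entry. It is not GetDataBack's code; the sample values (size, date, time) are made up for the example, and only the file name and first cluster follow the text.

```python
import struct

# 32-byte FAT short directory entry, little-endian:
# name, extension, attributes, reserved, creation time (tenths), creation time,
# creation date, access date, first cluster (high), write time, write date,
# first cluster (low), file size
DIR_ENTRY = struct.Struct("<8s3sBBBHHHHHHHI")


def parse_dir_entry(raw):
    """Decode name, size, date, time and first cluster from one entry."""
    (name, ext, _attr, _reserved, _ctime_tenths, _ctime, _cdate, _adate,
     cluster_hi, wtime, wdate, cluster_lo, size) = DIR_ENTRY.unpack(raw)
    base = name.decode("ascii").rstrip()
    suffix = ext.decode("ascii").rstrip()
    return {
        "name": f"{base}.{suffix}" if suffix else base,
        "size": size,
        "first_cluster": (cluster_hi << 16) | cluster_lo,
        "date": (1980 + (wdate >> 9), (wdate >> 5) & 0x0F, wdate & 0x1F),
        "time": (wtime >> 11, (wtime >> 5) & 0x3F, (wtime & 0x1F) * 2),
    }


# Hypothetical entry for IMG_2379.JPG starting at cluster 4.
sample = DIR_ENTRY.pack(b"IMG_2379", b"JPG", 0x20, 0, 0, 0, 0, 0,
                        0, 25536, 21039, 4, 2150000)
print(parse_dir_entry(sample))
```

Note that the entry names only the first cluster. Where the file continues is not stored here.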

The first cluster directly points to the initial cluster allocated for the file. It also points to a FAT entry that describes the clusters containing the remaining parts of the file. As it turns out, IMG_2379.JPG uses the clusters 4-529.

Information about a file in a FAT file system is spread among three different locations. The directory entry contains its name and where on the drive the file begins. The FAT knows where the file continues. Finally, the allocated clusters contain the file's content.

GetDataBack uses this information to reconstruct the files. Because information about a file is stored at three different spots, it will cause problems if any of these are missing or incomplete.
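The role of the FAT can be sketched as follows. The example models the table as a simple mapping from a cluster to its successor and uses the FAT32 end-of-chain convention; the function name and the dictionary model are assumptions for illustration, not the on-disk format.

```python
END_OF_CHAIN = 0x0FFFFFF7  # FAT32: values from here on mean "bad" or "last cluster"


def follow_chain(fat, first_cluster):
    """Return all clusters of a file by walking its FAT chain.

    fat maps a cluster number to the next cluster of the same file.
    A freed (zero or missing) entry ends the walk early.
    """
    chain, seen = [], set()
    cluster = first_cluster
    while 2 <= cluster < END_OF_CHAIN and cluster not in seen:
        chain.append(cluster)
        seen.add(cluster)
        cluster = fat.get(cluster, 0) & 0x0FFFFFFF
    return chain


# Intact FAT: IMG_2379.JPG occupies clusters 4 to 529.
intact = {c: c + 1 for c in range(4, 529)}
intact[529] = 0x0FFFFFFF
print(len(follow_chain(intact, 4)))

# FAT entries freed: only the first cluster is still known.
print(follow_chain({}, 4))
```

With the chain intact, every cluster of the file is known. Once the FAT entries are gone, only the first cluster from the directory entry remains, and the rest has to be guessed.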

FAT Recovery Matrix

The following matrix informs you about a file's recoverability depending on the presence of allocation, directory entry, or FAT.

Alloc  Dir  FAT  Recoverability
yes    yes  yes  File will be recovered perfectly
yes    yes  no   File will probably be recovered. Problem with fragmented files.*
yes    no   any  File has no name. It is possibly recoverable as a "lost file".**
no     yes  any  File is not recoverable, although its file name can still be seen.***
no     no   any  File is not recoverable. No trace of its existence is left.***

yes: This information is available

no: This information is not available

any: The outcome is the same whether or not this information is available

* Fragmentation

A widespread situation, caused by file deletion, formatting, or partition deletion, is a missing FAT entry. As long as the file is no larger than one cluster (e.g., 32 KB, depending on the drive size), you will get a perfectly recovered file because you do not actually need the FAT entry.

If the file is larger, it is usually allocated in consecutive clusters. Therefore, the most promising data recovery strategy is to assume contiguous clusters when rebuilding a file without a FAT entry. This method works for most files but runs into problems for files that grow over time. These files necessarily become fragmented when they cannot be extended consecutively because other data has meanwhile taken the adjacent clusters. Sadly, many important files fall into this category: email files, databases, large documents, and directories.
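The contiguity assumption can be sketched in a few lines. The first cluster and the file size come from the directory entry; the function name and the 32 KB default are assumptions for this example.

```python
import math


def assume_contiguous(first_cluster, file_size, cluster_size=32 * 1024):
    """Guess a file's clusters when its FAT entry is gone.

    The guess is a consecutive run starting at the first cluster,
    which is only correct for files that were not fragmented.
    """
    count = max(1, math.ceil(file_size / cluster_size))
    return list(range(first_cluster, first_cluster + count))


print(assume_contiguous(4, 20_000))           # fits into a single cluster
print(assume_contiguous(4, 3 * 32 * 1024 + 1))
```

For a fragmented file, the same call returns clusters that now belong partly to other files, which is why the rebuilt file ends up corrupted.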

GetDataBack employs several techniques to recover even fragmented files correctly. These techniques include considering the allocation of other files. GetDataBack is also capable of reassembling fragmented directories. But make no mistake: these efforts are doomed to fail for large and heavily fragmented files.

As annoying as it is, although their content is still somewhere on the drive, these files are unrecoverable.

No automated data recovery software can solve fragmentation satisfactorily. If you want to recombine a file consisting of 10 clusters on a 20 GB drive with a cluster size of 32 KB, you must analyze all possible combinations of one known cluster with 9 other clusters out of a possible 625000. These are 625000^9 possible combinations, a number with 53 digits.
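The size of this search space is easy to check with exact integer arithmetic. The sketch below follows the simplification in the text, which treats each of the 9 unknown positions as a free choice among all clusters on the drive; the variable names are illustrative.

```python
DRIVE_BYTES = 20 * 10**9    # 20 GB drive (decimal gigabytes)
CLUSTER_BYTES = 32 * 10**3  # 32 KB clusters

clusters_on_drive = DRIVE_BYTES // CLUSTER_BYTES
unknown_clusters = 9        # 10-cluster file, the first cluster is known

combinations = clusters_on_drive ** unknown_clusters
digits = len(str(combinations))

print(clusters_on_drive)
print(combinations)
print(digits)
```

Even at a billion candidate files tested per second, working through a number of this magnitude is out of the question.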

The only feasible and more intelligent approach is a "manual" data recovery for a particular file. With Runtime's DiskExplorer, you would begin at the known cluster and search downward, looking for data that you know belongs to the missing part of the file. Finally, you put all your findings together into a new file. The limitations of this approach are apparent: it can only be done for a couple of files with known content.

Even data recovery service companies will most likely not produce better results. While they might have a couple of tools, e.g., for extracting readable text, they do not have your knowledge about the content of the file.

** Directory Entry Missing (Lost Files)

If the directory entry was lost, but the file content is still on the drive, you might recover the file if you knew where it is located. This problem is different from the fragmentation problem. You do not know the name, size, and start of the file. This loss happens if the operating system re-used the original directory entry while the file content was left unchanged. 

If you formatted a drive and put gigabytes of a new Windows OS on it, the start of the old file system, including a lot of directory information, would be destroyed. In contrast, the files themselves might still sit in locations beyond the overwritten part.

Recovery software for these unreferenced "lost files" needs to scan each sector of the drive and compare its content with a list of known file signatures. Other problems then arise, such as deciding on the length of a file once its signature has been identified.
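A signature scan of this kind can be sketched as follows. The three signatures are the well-known leading bytes of JPEG, PNG, and PDF files; the function name and the short signature list are assumptions for illustration, and a real tool would know many more formats.

```python
SECTOR_SIZE = 512

# A few well-known file signatures ("magic numbers").
SIGNATURES = {
    b"\xFF\xD8\xFF": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
}


def find_signatures(image, sector_size=SECTOR_SIZE):
    """Return (sector number, file type) for every sector that starts
    with one of the known signatures."""
    hits = []
    for offset in range(0, len(image), sector_size):
        for magic, kind in SIGNATURES.items():
            if image.startswith(magic, offset):
                hits.append((offset // sector_size, kind))
                break
    return hits
```

The scan finds only where a file may begin. It reveals neither the file's name nor its length, and a fragmented file cannot be followed beyond its first run of clusters.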

In our experience, lost file recovery is a painstaking and lengthy process with often dubious results. You end up with vast amounts of unnamed files of unknown and very often corrupted content.

*** File's Allocation Was Overwritten

If the file's allocation was destroyed or overwritten by other data, as in the bottom rows of the recovery matrix, there is no possibility at all of recovering this file. Once the data is overwritten, it is infeasible to retrieve the information that was initially stored there. Theoretically, you might be able to read the residual magnetization with an advanced technology such as MFM (Magnetic Force Microscope), but it is unknown whether anybody can actually do this. Certainly, if this technology does exist, it is not "commercially available and affordable".

No data recovery software and no data recovery service company will be able to recover this file, although you still might be able to see its file name in GetDataBack.

Logical Reconstruction of an NTFS-Formatted Drive With GetDataBack

DiskExplorer for NTFS: MFT entry for file IMG_2379.JPG

As we will see, NTFS is a better file system when it comes to data recovery. Usually, there is NO problem with fragmentation.

Let's examine in detail how GetDataBack recovers a file:

A file in an NTFS file system is completely described by

  • its MFT entry *) (Master File Table),

  • the allocated clusters*) containing the content of the file. 

*) The colors above correspond to the region colors in the graphic below.

File in NTFS: MFT entry and allocation

The MFT entry is picked up during the initial scan of the drive when GetDataBack examines each sector. It contains the file's name, size, date, and time. Unlike the directory entry in FAT, it also includes the complete list of used clusters, called the run-list.

The run-list points directly to the file's allocated clusters. As it turns out, IMG_2379.JPG uses 0x1F5 (501) clusters beginning at cluster 0x3F02 (16130).

As we can see, information about a file in NTFS is spread between two locations. The MFT entry contains the file's name and the run-list describing the allocated clusters. The clusters themselves contain the file content.

GetDataBack uses this information to reconstruct the files. Note that in NTFS, unlike in FAT, we do not have a fragmentation problem. As soon as there is an MFT entry, we know exactly where the file is allocated. That yields better data recovery results for fragmented files.
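To show why a run-list removes the guesswork, here is a minimal sketch that decodes the NTFS run-list encoding into start cluster and cluster count pairs. The byte string for IMG_2379.JPG is a hypothetical encoding of the values given above, and the function name is an assumption; this is not GetDataBack's implementation.

```python
def decode_run_list(raw):
    """Decode an NTFS run-list into (start cluster, cluster count) pairs.

    Each run starts with a header byte: the low nibble gives the size of
    the length field, the high nibble the size of the offset field. The
    offset is signed and relative to the start of the previous run.
    """
    runs, pos, previous_start = [], 0, 0
    while pos < len(raw) and raw[pos] != 0x00:  # a zero byte ends the list
        length_size = raw[pos] & 0x0F
        offset_size = raw[pos] >> 4
        pos += 1
        count = int.from_bytes(raw[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            start = None  # sparse run without allocated clusters
        else:
            delta = int.from_bytes(raw[pos:pos + offset_size], "little", signed=True)
            previous_start += delta
            start = previous_start
        pos += offset_size
        runs.append((start, count))
    return runs


# One run: 0x1F5 clusters beginning at cluster 0x3F02.
print(decode_run_list(bytes.fromhex("22F501023F00")))

# A fragmented file simply has more than one run.
print(decode_run_list(bytes.fromhex("21100010210800FF00")))
```

A fragmented file is described just as precisely as a contiguous one, so no assumption about consecutive clusters is needed.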

NTFS Recovery Matrix

As information about a file is stored in two different spots, it will cause problems if any of these are missing or incomplete. The following matrix informs you about a file's recoverability depending on the presence of MFT entry or allocation.

Alloc  MFT  Recoverability
yes    yes  File will be recovered perfectly
yes    no   File has no name. It is possibly recoverable as a "lost file".*
no     yes  File is not recoverable, although its file name can still be seen.**
no     no   File is not recoverable. No trace of its existence is left.**

yes: This information is available

no: This information is not available

* MFT Entry Missing (Lost Files)

If the MFT entry was lost, but the file content is still on the drive, you might recover the file if you knew where it is located. This loss happens if the operating system re-used the original MFT entry while the file content was left unchanged. 

If you formatted a drive and put gigabytes of a new Windows OS on it, the start of the old file system, including a lot of MFT entries, would be destroyed. In contrast, the files themselves might still sit in locations beyond the overwritten part.

Recovery software for these unreferenced "lost files" needs to scan each sector of the drive and compare its content with a list of known file signatures. Other problems then arise, such as deciding on the length of a file once its signature has been identified.

In our experience, lost file recovery is a painstaking and lengthy process with often dubious results. You end up with vast amounts of unnamed files of unknown and very often corrupted content.

** File's Allocation Was Overwritten

If the file's allocation was destroyed or overwritten by other data, as in the two bottom rows of the recovery matrix, there is no possibility at all of recovering this file. Once the data is overwritten, it is infeasible to retrieve the information that was initially stored there. Theoretically, you might be able to read the residual magnetization with an advanced technology such as MFM (Magnetic Force Microscope), but it is unknown whether anybody can actually do this. Certainly, if this technology does exist, it is not "commercially available and affordable".

No data recovery software and no data recovery service company will be able to recover this file, although you still might be able to see its file name in GetDataBack.

Data Recovery From an Image After a Physical Problem (Bad Sectors)

When you run GetDataBack on an image obtained from a physically damaged drive, you will usually get good recovery results, assuming this image contains only "some" unrecoverable sectors.

Several factors contribute to this optimistic outlook:

  • A drive with bad sectors is usually not altered too much by the user's attempts to "fix" the problem.

  • If it is a FAT drive, the file allocation table and its copy are still there to be used by GetDataBack.

  • Most file system structures are available.

Of course, success depends on your ability to obtain this image. Files that were allocated in the damaged portions will be damaged after the recovery as well.

Data Recovery After Deleting or Recreating a Partition

When you delete a partition, only the partition table and the boot record are affected. Important structures, such as MFT and FAT, are usually undamaged.

Even recreating the partition — as long as you do not format the volume — should not alter important data structures.

With GetDataBack, you should be able to perform an almost perfect data recovery.

Data Recovery After Formatting

In FAT, formatting a volume clears both file allocation tables and deletes the root directory. Many files are still there, but you have lost:

  • All entries in the root directory: Files can only be recovered as "lost files". Subdirectories of the first level will have only numbers instead of their original name. Subdirectories of deeper levels show their original name.

  • The file allocation tables: This will cause the "fragmentation problem" discussed in the chapter "Logical Reconstruction of a FAT-Formatted Drive With GetDataBack".

Within the limitations above, you will get a "fair" data recovery. Most files should be uncorrupted. You will need to look for your files in the numbered directories. Fragmented files, such as Outlook email files or databases, will be corrupted and probably unusable.

In NTFS, formatting a volume creates a new MFT. However, this affects only the first 25 or so entries. It usually does not touch the MFT entries of previous user files.

That means you can expect a "good" data recovery. Almost all files should be correctly retrieved.

Your results will be even better if you formatted a previously FAT-formatted drive with NTFS or vice versa. In this case, the original FAT or MFT will probably not be damaged because these structures are located in different areas of the drive.

Data Recovery After Installing a New Windows Operating System

Here's where the trouble really begins. Installing a new OS can easily overwrite 10 GB or more.

All files located in these 10 GB will be irrevocably lost. Directory entries (FAT) and MFT entries (NTFS) located there will be lost as well, leaving files without a reference ("lost files"), even if they are located beyond the first 10 GB.

In FAT, this will also destroy the FATs, thus causing fragmentation problems.

As explained before, a technology capable of recovering data from residual magnetization is not commercially available. Everything you can possibly recover will come from the area that was not overwritten.

Damage Assessment

Example: Suppose you initially had a 20 GB FAT-formatted hard drive with 10 GB used for 50000 files in 2000 directories.

You put a new OS of 2 GB on that drive.

  • You lose 10% of all data on the drive (2 of 20 GB).

  • You lose 20% of your data on the drive (2 of 10 GB, assuming your data is concentrated on the first 10 GB of the 20 GB drive).

  • You lose 20% of your files that were allocated in the overwritten area.

  • Because most of the directory entries are located in the first 2 GB, you lose an additional 30% of your files ("lost files").

  • You lose an additional 10% of all files due to fragmentation and overlapping between the two areas. 

You will be able to recover just about 40% of your files undamaged. The other 60% will be damaged, lost files, or not recoverable at all.

If your files on the drive depended on other files, e.g., tables for databases, this number drops even further:

  • If your files depend on each other pair-wise, e.g., you have one Word document for "Contracts" and one for "Appendixes", only 0.4*0.4*100 = 16% of all pairs (projects) are left.

  • If you have projects of 5 files each on the drive, only (0.4)^5*100 = 1% of these projects are recovered without damage.

Important: Note that in the example above, losing 10% of the raw data can cause the loss of 99% of your projects.

If you are dealing with a previously NTFS-formatted drive, your prospects are brighter:

  • You will lose fewer files due to fragmentation and overlapping, only 5% instead of 10%.

  • You will also lose fewer MFT entries than you would lose directory entries in FAT, 10% instead of 30%. NTFS tends to spread the MFT across the drive.

You would recover 65% of your files undamaged.

You would recover 42% of your 2-file projects, more than two and a half times as many as with FAT.

You would recover 12% of your 5-file projects, about twelve times as many as with FAT.
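The percentages in this damage assessment can be retraced with the short sketch below. It takes the loss shares from the example and, like the text, assumes that files within a project are lost independently of each other; the function names are illustrative.

```python
def undamaged_share(*losses):
    """Share of files recovered undamaged after subtracting each loss share."""
    return 1.0 - sum(losses)


def intact_projects(file_share, files_per_project):
    """Share of projects in which every file survived."""
    return file_share ** files_per_project


# Loss shares: overwritten files, lost directory or MFT entries, fragmentation.
fat = undamaged_share(0.20, 0.30, 0.10)
ntfs = undamaged_share(0.20, 0.10, 0.05)

for name, share in (("FAT", fat), ("NTFS", ntfs)):
    print(name,
          f"files {share:.0%}",
          f"2-file projects {intact_projects(share, 2):.0%}",
          f"5-file projects {intact_projects(share, 5):.0%}")
```

Because the file share enters as a power of the project size, every additional file a project depends on reduces the chance that the whole project survives.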

Data Recovery After Imaging or "Ghosting" a Drive

The consequences of imaging over a drive, for example, with Norton's Ghost, are similar to the ones you face after putting a new OS on it. If the image was quite large, chances that you will recover a lot of files are pretty slim.

Data Recovery After Deleting Files

Although seemingly easy, recovering deleted files can be more complicated than recovering files from a bad sector drive, or after an Fdisk or format.

File deletion is the least understood topic. Ironically, what makes data recovery of deleted files so hard is the fact that the user can still work with the drive. Attempts to recover the just-deleted files often ruin the chances of success.

Let's have a look at how the operating system deletes a file.

File Deletion in FAT 

A single file gets deleted by

  • marking its directory entry with E5,

  • freeing the associated FAT entry.

Whole directories are deleted by 

  • marking the directory's directory entry with E5. The directory entries of the files inside the deleted directory are usually left unchanged,

  • freeing the FAT entries for both the directory and the files inside.

After deletion, there is a possible fragmentation problem because the allocation information stored in the FAT is irrevocably lost.
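What survives a deletion in FAT can be sketched by scanning a directory for entries marked with E5. The function name and the tuple layout are assumptions for this example; the offsets follow the standard 32-byte FAT directory entry.

```python
ENTRY_SIZE = 32
DELETED = 0xE5         # first name byte of a deleted entry
END_OF_DIR = 0x00      # first name byte of a never-used entry; none follow it
LONG_NAME_ATTR = 0x0F  # attribute value of long-file-name entries


def deleted_entries(directory):
    """Return (name, first cluster, size) for each deleted short-name entry.

    The first character of the name was overwritten by the E5 marker and
    is shown as "?".
    """
    found = []
    for pos in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        entry = directory[pos:pos + ENTRY_SIZE]
        if entry[0] == END_OF_DIR:
            break
        if entry[0] != DELETED or entry[11] == LONG_NAME_ATTR:
            continue
        name = "?" + entry[1:8].decode("ascii", "replace").rstrip()
        ext = entry[8:11].decode("ascii", "replace").rstrip()
        cluster_hi = int.from_bytes(entry[20:22], "little")
        cluster_lo = int.from_bytes(entry[26:28], "little")
        size = int.from_bytes(entry[28:32], "little")
        found.append((f"{name}.{ext}" if ext else name,
                      (cluster_hi << 16) | cluster_lo, size))
    return found
```

The name (minus its first character), the size, and the first cluster are still there after deletion. The remaining clusters are not, since they were recorded only in the freed FAT entries.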

File Deletion in NTFS

A file gets deleted by flagging its MFT entry as unused. The MFT entry still contains the file's allocation. The file is, therefore, easier to recover than its FAT counterpart.

The Recycle Bin 

What we described above applies to the "permanent deletion" of files. If you do not delete them permanently, they are moved to the "Recycle Bin" and can be recovered from there.

When files are moved to the Recycle Bin, they are renamed (for whatever reason) to numbers while keeping their extension. For example, My vacation.doc will get a new name like d24.doc in the Recycle Bin folder. These internal details do not matter as long as these files are still in the Recycle Bin. The OS will provide you with the correct name when you choose to undelete these files.

If you "empty" the Recycle Bin, however, the deletion processes described above are carried out for these renamed files. If you later wish to recover My vacation.doc, you will actually have to look for an unknown file name with the extension doc.

Chances For Successfully Recovering Deleted Files

The locations of the deleted files are not protected by the file system anymore. Those locations might be recycled the next time the OS creates a new file. That's why it is such a problem if the user continues to work with the affected hard drive.

Files are created all the time. Processes write log files, printers queue print jobs, the internet browser creates lots of temporary files. Even booting and running Windows from the affected drive can overwrite the critical areas.

To protect those deleted files, the user must stop working with the drive immediately and connect it to another computer as an additional drive.

We've tested how long a deleted file was recoverable before the OS recycled the deleted file's directory entry, MFT entry, or allocation. It happened almost instantly. That leads us to be very pessimistic about the prospects of recovering "a couple" of deleted files.

By contrast, if you deleted, say, 1 GB consisting of 1000 files and do not continue to work with this drive, the chances of recovering most of these files are pretty good. If you work with FAT, you will possibly face a fragmentation problem.

Recovering Data In Time 

The time dimension often gets underestimated when it comes to data recovery. Losing data for a week can be as bad as losing the data forever.

That emphasizes the importance of data recovery software like Runtime's GetDataBack. Engaging a data recovery service company will always involve a turnaround time of several days. Doing it yourself requires a little preparation, and after a couple of hours, the data recovery is completed.

While you have to seek professional help for physically damaged drives, most data losses are logical file system corruptions. The recovery services won't use better tools than you can use.

© 2021 Runtime Software