EXT3 Undelete

August 9, 2007

Today, against everything I have learned over the years, I managed to delete all of the work I did yesterday, and I didn’t have a backup. Normally this would be a big problem but not necessarily a catastrophe, given how many undelete utilities are available (even for Linux). However, it turns out that there is no good way to recover deleted files from an ext3 partition because of the way it carries out an unlink. After some searching, I finally found a workaround and managed to recover my files.

First, let me say that the easiest way to avoid this problem is to schedule a daily backup. I know this, and I always do this, but today several small problems converged. To begin with, I recently upgraded to the latest version of Ubuntu Linux. I have had an automatic backup script in place for years, but since the upgrade I have been too lazy to set it up again. Thus, my backup has not been running lately.

This wasn’t a huge problem in my mind because I have another level of redundancy keeping my files safe: every day I synchronize the data for my current working projects between my desktop and laptop via a USB flash drive. A small script, which I run when I begin working and again after I finish, synchronizes everything to the flash drive. However, earlier this week I tweaked the script to allow me to synchronize over the internet using rsync. In doing so, I inadvertently broke the script so that it no longer updated the USB stick properly.

When I came into the office today and synchronized, I saw all of my work from yesterday being deleted! Because the flash drive was out of date, synchronizing it to my hard drive deleted everything new. Plus, my backup was not working! So, basically, my two safety nets had both failed. I was lazy about fixing the backup, so I was left with only one layer of redundancy, which, because of a stupid scripting mistake, also failed.

Typically, there would have been a third layer of protection in some kind of undelete utility. Unfortunately, conventional undelete does not work on an ext3 filesystem because of the design of the filesystem itself: when a file is unlinked, ext3 zeroes the block pointers in its inode, so a utility can no longer tell which blocks belonged to the file. Luckily, the main files I lost today were Fortran code, which is plain text. I also lost some binary files, such as an SQLite database I had updated and a few new GnuCash transactions, but they were much less important.

To recover plain text data, one can simply do a brute-force grep of the raw partition for a string that is unique to the deleted file. My Fortran files all contained such a string, the name of a module I wrote called “auxpf_filter.” Using the following command, I was eventually able to find the files I needed.

grep --binary-files=text -300 "auxpf_filter" /dev/hda4 > output.txt

A few notes: It is best to unmount the filesystem containing the deleted files first (or remount it read-only), so you do not risk overwriting them with new data; for the same reason, write output.txt to a different filesystem. The -300 option tells grep to report the 300 lines before each line containing the string “auxpf_filter” and the 300 lines following it. The file output.txt will contain a lot of garbage and, hopefully, some snippets of the file of interest. The file may have been split into several parts on the filesystem, so you may have to repeat the search with different strings to get all of it. There is also a real risk that you will not be able to recover the file at all if it has already been overwritten. I was lucky enough to recover everything, but your mileage may vary.
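
If output.txt turns out to be too noisy to work with, one possible variation on the grep search above is to ask grep only for the byte offset of each hit and then copy the surrounding disk blocks into separate files with dd. The following is a rough sketch, not the exact procedure I used. It assumes GNU grep, and the function name dump_hits, its parameters, and the 4096-byte block size are illustrative assumptions rather than anything required by ext3.

```shell
#!/bin/sh
# Sketch only: dump the disk blocks around each occurrence of a marker string.
# Assumes GNU grep; dump_hits and its parameters are hypothetical names.

dump_hits() {
    source_path=$1            # raw partition (e.g. /dev/hda4) or an image of it
    marker=$2                 # unique string, e.g. auxpf_filter
    out_dir=$3                # should be on a different filesystem
    blocks_each_side=${4:-16} # how many blocks to keep before and after a hit
    block=4096                # assumed block size; used only to align the dump

    mkdir -p "$out_dir" || return 1

    # -a: treat binary data as text, -b: print byte offsets,
    # -o: print only the match, so the offset is that of the marker itself,
    # -F: take the marker as a fixed string rather than a regular expression.
    grep -a -b -o -F -e "$marker" "$source_path" | cut -d: -f1 |
    while read -r offset; do
        first=$(( offset / block - blocks_each_side ))
        if [ "$first" -lt 0 ]; then
            first=0
        fi
        # Copy the hit's block plus its neighbours into a separate file.
        dd if="$source_path" of="$out_dir/hit_$offset.bin" bs="$block" \
            skip="$first" count=$(( 2 * blocks_each_side + 1 )) 2>/dev/null
    done
}
```

Calling it as dump_hits /dev/hda4 auxpf_filter /mnt/usb/recovered (the output directory here is a made-up path) would leave one hit_OFFSET.bin file per match, each of which can then be opened in a text editor and trimmed by hand.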

I immediately reinstated my backup script. It’s amazing what a little motivation can do.