Wednesday, October 23, 2013

Recovering a PostgreSQL Database

A couple of weeks ago I got to pull an all-nighter at the office. This was a first for me at this job but it was necessary. It was also a first in that I lost a PostgreSQL database. Considering I have been working with PostgreSQL for over a decade, that speaks to its reliability as I was always losing Oracle databases when I worked for them.

The culprit turned out to be some maintenance we were doing on our network that caused us to loose connection with our database machines. We had to do a hard reboot on them. Normally that would not cause a problem with PostgreSQL but we were using the ext4 filesystem and that did. We lost our global/pg_filenode.map file which is a pretty bad problem. Searching through the Internet we discovered how bad. Everyone pretty much agreed that we needed to find a backup of that file or all our data was lost.

We searched high and low for a backup of the file without any luck. Our next step was to restore from our nightly backups. Unfortunately our backup had not been taking place and so that was not an option. Our last-ditch effort was to go through the data files and try to recover the information one byte at a time. At least we still had those.

There is a utility for PostgreSQL called pg_filedump that allows you to go through your data files and browse their contents. I located a series of scripts that helps you find the data files that contain table data. We really only needed to rebuild 2 tables and they were not very large.  The pg_filedump program helped us locate the data we needed. It took a while but we were able to get everything back.

There were a number of problems that all cascaded together and caused me and my coworker to stay up all night. While it shouldn't have happened, it did. The trick to getting our data back was never giving up.

No comments:

Post a Comment