Saturday, January 23, 2010

Six Myths About Backing Up Your Computer

This is my tale about how I got religion about backups. I played a part in all of the stories below. The amazing part is not that any one or two of them happened around me - the amazing part is that they all happened around me.

Myth #1. I have a new hard drive, my data is safe.

Five years ago I helped a friend replace his hard drive after his original drive failed and he lost everything. So he bought the new drive, spent two or three days reloading everything, and went on his way. Three months later, the new drive failed too, and again he lost everything. My friend is an optimist. He believes he's had all the bad luck he's going to have and he still doesn't do backups.

A few years later, I was overseeing putting a new server in the office and had paid a consultant a lot of money to set it up to our satisfaction. The server was backed up after the OS was installed, but before the consultant made changes. After the consultant was finished, we went home for the weekend. When we came in on Monday, the hard disk had failed. All of the money we paid for the consultant was wasted. Even more embarrassing, the CD for the backup software was sitting on top of the server.

The Reality: Hard drives have their highest failure rate when they are new.

Myth #2. Backing up once or twice a month is enough.

My neighbor conscientiously made weekly backups of his laptop. He had decided that losing a few days of work wouldn't be a big deal. Inevitably, his hard drive died hours before a sizable document was due to his largest customer. He had a four-day-old backup, but he'd done well over thirty hours of work on the document in the interim. I was able to recover the data with low-level tools, but recovery companies charge hundreds to thousands of dollars for that kind of work.

The Reality: Murphy's Law should never be underestimated. Nightly backups are a minimum, and software for continuous backup is easily available.
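
One way to keep yourself honest about "nightly" is to have a script complain when the newest backup is older than a day. The following is only a minimal sketch: it assumes your backup tool writes files into a single folder, and the function names and the 24-hour threshold are my own illustrative choices, not part of any particular backup product.

```python
import time
from pathlib import Path


def newest_backup_age_hours(backup_dir, now=None):
    """Return the age in hours of the most recently modified file
    directly inside backup_dir, or None if the folder holds no files."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def backup_is_stale(backup_dir, max_age_hours=24.0, now=None):
    """True when there is no backup at all, or the newest one is
    older than max_age_hours (hypothetical default: one day)."""
    age = newest_backup_age_hours(backup_dir, now)
    return age is None or age > max_age_hours
```

Run from a scheduled task, a check like this would have flagged my neighbor's four-day-old backup three days before it mattered.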

Myth #3. Backing up documents to an external drive is good enough.

Two years after I helped my optimist friend above, the primary hard drive abruptly failed in our corporate ecommerce web server. We had a backup of the database and other data, but we still had to reconfigure the operating system from scratch. It took four days to get the server running again, which was basically four days of lost sales. I learned a harsh lesson, since it cost thousands of dollars in sales and made a lot of existing customers unhappy.

The Reality: To quickly recover from a catastrophic failure, you must have image backups, not data backups.

Myth #4. Having an image backup on an external drive or a NAS is good enough.

I have now been in two different offices that were burglarized and the computers were stolen. If I'd had a USB hard drive sitting on the shelf, it would have been taken too. Even if you put the backup in the safe, it just takes one flood, fire, tornado, or hurricane to destroy the sensitive platters of a hard drive.

I've also known people whose office was served with a search warrant. The content on computers is just assumed to be relevant, so officers often take *everything* that looks like a computer or a hard drive - including what's in the safe. If you don't have off-site backups, you're done because you may not ever get that equipment back.

The Reality: You must have off-site backups.

Myth #5. I use RAID hard drives so I don't have to do backups.

My primary computer had mirrored hard drives installed right from the start. Six months later, my computer contracted the second virus I've ever had. However, RAID is protection against disk failure, not against data loss that was intentionally caused by a virus. I recovered from the failure by reloading the entire system from image-level backups, as I described in #3.

As I became increasingly paranoid about hard drive failures, I upgraded our corporate web server with a hardware RAID card and enterprise-grade SCSI hard drives to allow RAID 1 drive mirroring. SCSI hard drives generally cost several times the price of a similarly-sized IDE (consumer) drive because they are expected to be substantially more reliable.

A week after the new hardware went live, I started seeing I/O errors. Our expensive new RAID controller board had malfunctioned. As a result, it corrupted the SQL database that contained the customer support system, which contained FIVE YEARS of customer support information, including an extensive knowledgebase. The primary backup was completely missing due to a configuration error in the backup script. I ended up staying up all night to restore the database from a third-level backup that I'd been paranoid enough to perform.

The Reality: If you don't have 100% redundancy, including spare computers ready to go, downtime and possible data loss are just one hardware failure away. Also, RAID provides zero protection against viruses and user error.

Myth #6. Paying experts to do backups is good enough.

After we upgraded to RAID hard drives, we purchased Managed Backup for our corporate server. This means that experts are responsible for keeping our server backed up, and they are responsible for recovery when things go wrong.

Last summer one of our web server log files was erased accidentally. These files are critical for financial analysis and prediction, so I was pleased that we had the experts managing our backup. Except for one thing. The "experts" considered the log files low value and a waste of space, so the file I needed had not been backed up. Damn.

The Reality: Backups that haven't been verified and tested aren't backups.
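
Verifying can be as simple as restoring the backup to a scratch folder and comparing it against the live data. Here is a rough sketch of that idea, assuming both the original files and the restored copy are reachable as ordinary folders. The names `sha256_of` and `verify_restore` are hypothetical helpers of my own, not features of any backup package.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns two lists of relative paths: files missing from the
    restore, and files whose contents differ from the original.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    missing, mismatched = [], []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = restored_dir / rel
        if not copy.is_file():
            missing.append(rel.as_posix())
        elif sha256_of(src) != sha256_of(copy):
            mismatched.append(rel.as_posix())
    return missing, mismatched
```

A test restore checked this way would have listed our web server log files as missing long before I needed one of them. Files that change between the backup and the check will show up as mismatches, so it works best against data that is quiet or against a snapshot.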

Conclusion

You can't be too paranoid about backup. I have four levels of redundancy in my backups, but I still worry that it won't be good enough. If your computer contains critical information, whether it's corporate documents, baby pictures, legal documents, a graduate thesis, or financial records, do yourself a favor and back it up, preferably in multiple places.

In Part 2, I'll look at how I've implemented my tiered backup strategy and what software I use to make it happen.