There are two sayings you can find in virtually every article about data protection: “If your data doesn’t exist in at least two places, then it doesn’t exist,” and “If you can’t restore your data, then it doesn’t exist.” This is for good reason; not testing backups and (believe it or not) not having backups are the two most common business-affecting mistakes that IT teams make.
We should add a third saying: “How do you know you can actually restore your data from your current backups? Have you tried?”
Most people are at least passingly familiar with Erwin Schrödinger’s thought experiment involving quantum mechanics, a radioactive atom, and a cat. The unfortunate feline is neither alive nor dead (more accurately, it’s both) until its state has actually been observed. You can probably see where this analogy is going.
Let’s put a practical (though pessimistic) slant on it: Your organization’s backups are definitely dead unless you’ve observed them being successfully restored to production. No quantum superposition here. If you can’t restore your data from backups, it doesn’t exist.
You need to test your ability to restore from backups regularly. You don't want to discover whether a restore works only after a disaster has occurred and the business is counting on you. Murphy's Law ("If anything can go wrong, it will") is a terrible force, and none of us should trifle with it.
Let's look at some numbers. A recent Dell/EMC survey (Protect Your Data At Every Point Possible, 2017) found that just 18% of respondents believe their current data protection solution will meet all future business challenges. The same survey reported the average cost of data loss as $900,000, and the average cost of downtime was pegged at $555,000. These numbers aren't out of line with other surveys covering the data protection space, so let's ponder them for a while.
That only 18% of respondents believe their organization's data protection approach is adequate is disconcerting, to say the least. More accurately, it’s terrifying. To look at it another way, 82% of organizations aren't confident that their data protection works.
The other numbers in the survey are an average cost of $900,000 for data loss and $555,000 for non-data-loss-related outages. Now, these numbers should be taken with a grain of salt; the survey covered respondents from organizations of varying sizes. Data loss and outages for larger organizations can be quite significant, and the high-end numbers reported in that survey strayed far from the average.
That said, the costs are worth paying attention to, even for organizations as small as 50 people. It's rarely direct costs that get you—more likely, it is the staffing costs required to solve the problem. If those costs aren’t incurred in the form of hiring someone to make the problem go away, they usually manifest as overtime hours for existing IT staff.
This brings us back to testing your backups (and more importantly, automating the testing of your backups). Consider what success looks like for each backup and workload type, then set up success conditions for that automated testing beforehand.
Consider, for example, the backup of a file share on a file server. There are several possible success conditions: you might want to make sure the directory structure is readable and traversable, and you might also want to verify that the files themselves are as expected.
One way to do this might be to place test files at specific points in the file structure and verify the version offered by the data protection solution is the same as the version known to have been placed in the file path. You may want to test not only that the primary backup branch is traversable in this manner, but that at least the last three versions are as well. This can help when dealing with some problems, such as ransomware.
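As a concrete illustration of the canary-file approach just described (this is a generic sketch, not tied to any particular backup product; the `verify_canaries` helper, the relative paths, and the choice of SHA-256 digests are all assumptions for the example):

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_canaries(restore_root: Path, expected: dict) -> list:
    """Check each canary file under a restored tree against its known digest.

    `expected` maps relative paths to the SHA-256 digests recorded when the
    canary files were originally placed on the file share. Returns a list of
    failure messages; an empty list means the restore passed this check.
    """
    failures = []
    for rel_path, known_digest in expected.items():
        candidate = restore_root / rel_path
        if not candidate.is_file():
            failures.append(f"missing: {rel_path}")
        elif sha256_of(candidate) != known_digest:
            failures.append(f"corrupt: {rel_path}")
    return failures
```

You would run a check like this against each restored version you care about, such as the three most recent restore points, comparing against the digests recorded when the canary files were planted. A digest mismatch on recent versions but not older ones is exactly the pattern ransomware tends to leave.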
Testing workloads is trickier. There’s nothing special about attaching a file to a hypervisor, calling it a disk image, and telling a virtual machine (VM) to boot from it; actually having the workload boot from that image is the important part.
SolarWinds® Backup takes a practical and comprehensive approach to this testing. It is engineered to attach a workload image to a hypervisor, try booting it, wait until VM activity settles (indicating booting has likely completed), and then email a screenshot of that VM's console to an administrator.
This can help you to verify that the backups of bare-metal workloads—which won't have Hyper-V integrations or VMware tools in them—have successfully been able to boot in the disaster recovery environment. Bare-metal workloads are usually the hardest to verify, especially since most disaster recovery options are physical-to-virtual, at least until the backup can be pushed back to the original iron.
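The "wait until VM activity settles" idea can be illustrated generically. The following is a hypothetical sketch, not SolarWinds Backup's actual implementation; `sample_activity` stands in for whatever hypervisor API reports guest CPU activity, and all thresholds are made-up defaults:

```python
import time

def wait_until_settled(sample_activity, threshold=5.0, quiet_samples=6,
                       interval=10.0, timeout=600.0, sleep=time.sleep):
    """Poll a VM activity metric (e.g., CPU %) until it stays below
    `threshold` for `quiet_samples` consecutive readings, suggesting the
    guest OS has finished booting. Returns True if settled, False on timeout.

    `sample_activity` is a caller-supplied function returning the current
    activity reading; `sleep` is injectable so tests don't actually wait.
    """
    deadline = time.monotonic() + timeout
    quiet = 0
    while time.monotonic() < deadline:
        if sample_activity() < threshold:
            quiet += 1
            if quiet >= quiet_samples:
                return True  # quiet streak long enough; boot likely done
        else:
            quiet = 0  # activity spiked; restart the quiet streak
        sleep(interval)
    return False
```

Injecting the sampling and sleep functions keeps the policy (how long counts as "quiet enough") separate from any particular hypervisor API, and makes the settle logic testable without booting a real VM.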
The critical part—as repeatedly mentioned here—is that we all test our backups. If this level of repetition is frustrating, or even infuriating, that’s good! It's an important enough topic to be worth a little vexation. Hopefully, it will spur you to go test your backups, or better yet, automate the testing of those backups.
Prevent cruelty to hypothetical animals. In the case of a thought experiment about backups, that cat is very definitely dead unless you can prove otherwise—and so is your company’s data.
For those of you who don't know how to test your backups, or who don't have a data protection solution that can automate that testing, why not give SolarWinds Backup a try?
Carrie Reber is senior product marketing manager for SolarWinds MSP.
© 2018 SolarWinds MSP Canada ULC and SolarWinds MSP UK Ltd. All rights reserved.
The SolarWinds and SolarWinds MSP trademarks, service marks, and logos are the exclusive property of SolarWinds MSP UK Ltd. or its affiliates. All other trademarks are the property of their respective owners.