Measuring the cost of downtime

Danny Bradbury

Are you responsible for a company’s IT infrastructure? Did you sleep well last night? Well, here are some numbers that might stop you from sleeping quite so well tonight.

Datacenters often guarantee how much uptime they’ll give you in terms of the number of nines. A common one is ‘five nines’ of uptime, which means that they guarantee to run your apps 99.999% of the time (this metric counts digits on both sides of the decimal point).

There are 525,949 minutes in an average year. 99.999% of that is 525,943.74. Subtracting one from the other gives you a difference of 5.26 minutes. That’s roughly 5 minutes and 16 seconds of allowed application downtime each year.
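As a rough sketch of that arithmetic, the snippet below turns an availability percentage into a yearly downtime budget. The function name and the constant are illustrative, not taken from any report; the minutes-per-year figure is the one used above.

```python
# Minutes in an average year, as used in the article.
MINUTES_PER_YEAR = 525_949


def allowed_downtime_minutes(availability_percent, minutes_per_year=MINUTES_PER_YEAR):
    """Return the minutes of downtime per year that an availability guarantee permits."""
    return minutes_per_year * (1 - availability_percent / 100)


# Five nines: 99.999% availability.
budget = allowed_downtime_minutes(99.999)

# Convert the fractional minutes into whole minutes and seconds.
minutes, seconds = divmod(round(budget * 60), 60)

print(f"{budget:.2f} minutes a year, or about {minutes} min {seconds} s")
```

Swapping in 99.99 or 99.9 shows how quickly the budget grows as nines are dropped.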

That might not seem like much, until you realize that the cost of unplanned downtime can run into thousands of dollars a minute, which is the other number in our equation.

Just how many thousands of dollars? Gartner puts it at about $5,600 per minute, but admits that there is a “large degree of variance”. That figure reflects the cost reported in a 2010 Ponemon Institute and Emerson Network Power report. In 2013, though, the companies updated this with a second survey, which bumped up the cost by 41% to $7,900 per minute. That’s right: the cost of downtime is going up.
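To put the two numbers together, here is a minimal sketch that multiplies a five-nines downtime budget by the per-minute figures quoted above. It assumes every allowed minute is actually used and that the average cost applies evenly, which real outages won't respect; the variable and function names are illustrative.

```python
# Per-minute cost figures quoted in the article (US dollars).
COST_PER_MINUTE_2010 = 5_600
COST_PER_MINUTE_2013 = 7_900

# Five-nines downtime budget: 0.001% of the minutes in an average year.
FIVE_NINES_BUDGET_MINUTES = 525_949 * (1 - 0.99999)


def downtime_cost(minutes, cost_per_minute):
    """Estimate the cost of an outage as duration times an average per-minute cost."""
    return minutes * cost_per_minute


cost_2010 = downtime_cost(FIVE_NINES_BUDGET_MINUTES, COST_PER_MINUTE_2010)
cost_2013 = downtime_cost(FIVE_NINES_BUDGET_MINUTES, COST_PER_MINUTE_2013)

# Relative increase between the two surveys, as a percentage.
increase_percent = (COST_PER_MINUTE_2013 - COST_PER_MINUTE_2010) / COST_PER_MINUTE_2010 * 100

print(f"2010 figure: ${cost_2010:,.0f} a year")
print(f"2013 figure: ${cost_2013:,.0f} a year")
print(f"Increase: {increase_percent:.0f}%")
```

Even a datacenter that keeps its five-nines promise could, on these averages, cost you tens of thousands of dollars a year in downtime.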

Contributory factors

What are the biggest costs associated with unplanned datacenter downtime? Business disruption was by far the most significant, with a total of $238,717 lost across 67 datacenters. This includes customer churn and reputation damage, said the survey, which splits out the second largest cost, lost revenue, at $183,724. We think these two are closely related, as disruptions to business naturally cost revenue.

The other large cost was a loss in end-user productivity, which cost those 67 datacenters $140,543 between them in FY 2013. After that, the individual costs got far lower. Datacenters had to contend with the following in size order: IT productivity, detection, recovery, equipment repair and replacement, post-downtime activities and the cost of hiring third parties to sort out the mess.

What affects the cost of downtime? A big factor is the extent of an outage. Datacenters aren’t always unlucky enough to suffer a complete outage. Some apps and services may still work. A partial unplanned outage costs on average a little over a third of a complete failure.

The other contributing factor is the length of the outage. Complete failures last 63 minutes longer than partial ones.

It’s just as important to understand the causes of that downtime as it is to know the consequences. Good old-fashioned UPS failures laid systems low a whopping 24% of the time. Over one in five outages (22%) were caused by human error, and DDoS attacks took out app availability 18% of the time. Other common causes were weather issues, water, heat or cooling failures, generator failures, and IT equipment failure.

Oh, by the way, here are some other things to consider. First, five-nines datacenters aren’t always a certainty. You should know whether your own in-house IT infrastructure or tier 3 hosting facility offers this level of availability, or perhaps four nines (about 52.6 minutes of downtime a year) or three nines (which would allow systems to stop working almost 8.8 hours a year).
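If you want to check what each level of availability really allows, a short sketch like the following lists the yearly downtime budget for one to five nines. It uses the same 525,949 minutes-per-year figure as earlier; the function name is illustrative.

```python
# Minutes in an average year, as used in the article.
MINUTES_PER_YEAR = 525_949


def downtime_minutes(nines):
    """Yearly downtime allowed by an availability of `nines` nines (e.g. 3 -> 99.9%)."""
    return MINUTES_PER_YEAR * 10 ** -nines


for nines in range(1, 6):
    availability = 100 * (1 - 10 ** -nines)
    minutes = downtime_minutes(nines)
    # Show enough decimal places to display every nine.
    label = f"{availability:.{max(nines - 2, 0)}f}%"
    print(f"{nines} nine(s) {label:>8}: {minutes:>10.2f} min = {minutes / 60:>7.2f} h")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why the gap between three and five nines is the gap between most of a working day and a coffee break.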

Secondly, when those downtime minutes occur matters a lot. If you’re an online retailer, and they occur in mid-December, it’ll cost you an awful lot more than at other times of the year.

Business continuity should therefore be an important part of any IT department’s strategy. Having systems in place to minimize human error, back up key data and provide cloud-based alternatives to your own critical IT operations can help to reduce the risk of key system failure.

At least then, you might get some shut-eye. Sorry about that.