Amazon Outage – The Reality of 99.95% Uptime

As many have heard, Amazon and more than 100,000 of its clients experienced about 4 hours of downtime on Tuesday, February 28th, which Amazon’s blog post attributes to “human error.” It is no secret that 4 hours of downtime can have a tremendous impact on a company, but today we at Digital Edge want to focus on the industry’s uptime standards, the quality of the services provided, and some practical suggestions for clients and colleagues. We also want to share our opinion on today's cloud offerings and some other available technologies.

Amazon Web Services (AWS) is used as the cloud platform by so many companies that Amazon is estimated to control roughly 0.8% of all internet businesses. With this much power, industry bloggers and reporters warn that Amazon should be extremely careful in managing its cloud computing systems. The smallest of mistakes can have an enormous impact, as last Tuesday’s outage made evident. Amazon’s blog post blames a typo, made while executing a simple billing department command, for setting off a catastrophic chain of events.

Without criticism or judgment, Digital Edge knows from long experience in the industry that mistakes happen. But because Amazon has a flat offering, it treats everybody the same: whether you run a staging, development, or production system, and whether or not your IT system is mission critical, you will be managed in the identical way. Just imagine for a second that you run a mission critical IT system with multiple environments tailored to your company. Under Amazon’s offering, it could be dropped and recovered with the same processes and procedures as non-mission critical IT systems.

Amazon promises a 99.95% SLA which, by definition, allows for 4.38 hours of downtime a year for clients. There are several issues with this promise. The first is how that downtime is distributed: normally, everything is fine when those 4.38 hours are split across multiple smaller outages, but such a long single outage is painful for everyone.
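To make the arithmetic behind that figure concrete, here is a minimal sketch in Python. It assumes the SLA is measured over a 365-day year, as the 4.38-hour figure implies; the function name and constants are illustrative and not part of any Amazon tooling.

```python
# Illustrative only: converts an SLA percentage into a yearly downtime budget.
# Assumes a 365-day year (8,760 hours), matching the article's annual framing.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime an SLA percentage permits over the period."""
    return (1 - sla_percent / 100) * period_hours


budget = allowed_downtime_hours(99.95)
outage_hours = 4  # approximate length of the February 28th outage

print(f"99.95% SLA allows {budget:.2f} hours of downtime per year")
print(f"A {outage_hours}-hour outage uses {outage_hours / budget:.0%} of that budget")
```

Under these assumptions, the yearly budget comes out to about 4.38 hours, so a single 4-hour outage consumes roughly nine tenths of it in one event.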

The next concern is what happens when companies experience more downtime than the 99.95% SLA allows. Companies that were victims of the 4-hour downtime are supposed to receive a service credit; however, Amazon makes it very difficult for companies to actually receive one. It has strict requirements that, if not followed exactly, make a company ineligible for the credit.

“Customers should expect that the likelihood of a meaningful giveback is basically nil.”
- Lydia Leong, analyst from Gartner   

However, giveback aside, all you really need is UPTIME. That raises the next question: is a 99.95% SLA a good offer for your company at that price? Can your business sustain a continuous 4-hour outage?

“The AWS SLA also has the dubious status of ‘worst SLA of any major cloud IaaS provider’”
- Lydia Leong

Digital Edge believes that there are much better offers on the market for the same price.

Currently, Digital Edge maintains a 100% SLA, with round-the-clock management ensuring your company receives no downtime. Last year we maintained 99.995% uptime, and we ended 2015 with 100% uptime. We have 24/7 support detecting any issues before they surface. Learn more about what we offer on our website!

Click here to learn more about Digital Edge’s Cloud Offer. 

Click here for Digital Edge’s Cloud Price Estimate Tool.

