Yesterday Amazon’s cloud infrastructure failed. If you use the internet much, you probably noticed at least one or two sites that were down. This raises an interesting question about the potential dangers of cloud computing at the enterprise level: if an oligopoly forms where there are 10-20 large cloud providers in the country, have we solved companies’ individual problems at the price of exposure to a societal problem?
There’s little doubt that Amazon has one of the most highly available infrastructures in the world. There’s little doubt that they achieve higher availability than just about any company that uses their services would on its own. I would venture to guess that the internet as a whole has higher average site availability today than it did before the migration of sites into the cloud. However, we learned yesterday that when a large cloud provider fails, all of the companies that use its services go down simultaneously. So instead of each company that uses AWS having, say, 3 failures per year (as it would have in pre-cloud days), each has only one failure per year, but they ALL have the same failure at the same time. That has the potential to be a completely different, and much larger, problem.
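To make that tradeoff concrete, here is a rough back-of-the-envelope sketch in Python. The 3-versus-1 failure counts come from the paragraph above; the 100 companies, the 4-hour outage length, and the assumption that pre-cloud failures are independent of each other are all made up for illustration, not measurements.

```python
HOURS_PER_YEAR = 24 * 365  # 8760

def total_outage_hours(companies, failures_per_year, hours_per_failure):
    """Company-hours of downtime per year, summed across all companies."""
    return companies * failures_per_year * hours_per_failure

def chance_all_down_independent(companies, failures_per_year, hours_per_failure):
    """Chance that every company is down in a given hour, assuming each
    company fails on its own schedule (independent failures)."""
    p_one_down = failures_per_year * hours_per_failure / HOURS_PER_YEAR
    return p_one_down ** companies

def fraction_all_down_shared(failures_per_year, hours_per_failure):
    """Fraction of the year in which every company is down together,
    assuming they all share one provider and fail with it."""
    return failures_per_year * hours_per_failure / HOURS_PER_YEAR

# Hypothetical inputs: 100 companies, outages lasting 4 hours each.
COMPANIES = 100
OUTAGE_HOURS = 4

pre_cloud_total = total_outage_hours(COMPANIES, 3, OUTAGE_HOURS)  # 1200
cloud_total = total_outage_hours(COMPANIES, 1, OUTAGE_HOURS)      # 400

pre_cloud_all_down = chance_all_down_independent(COMPANIES, 3, OUTAGE_HOURS)
cloud_all_down = fraction_all_down_shared(1, OUTAGE_HOURS)

print("total downtime, pre-cloud vs cloud:", pre_cloud_total, cloud_total)
print("everyone down at once, pre-cloud vs cloud:", pre_cloud_all_down, cloud_all_down)
```

Under these assumptions the cloud cuts total downtime to a third, but the chance of everyone being down in the same hour goes from effectively zero to a few hours every year. Individual risk shrinks while shared risk appears.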
The nature of this larger danger is that when all those sites fail at once there’s a cascading effect. It’s not just the directly affected companies; it’s all the companies that rely on them. For example, imagine if PayPal were one of the companies hosted on Amazon’s AWS (to be clear, it wasn’t down yesterday, but imagine). Then all the sites that rely on PayPal for checkout would be unable to take PayPal payments. That’s not too big of a deal, though, because most companies offer at least two different ways to check out. Here’s where it gets tricky. If Amazon is bringing down large portions of the internet, it’s possible that an Amazon failure takes down not just PayPal but a couple of other payment clearing services with it. Now we’re talking about taking down the checkout services for companies that have planned to be highly available, companies that don’t even use Amazon. That has the potential to noticeably slow the US economy for a period of time.
That’s a scary thought. There’s a simple moral, though. Trusting the cloud for SaaS is no different from trusting a server or a datacenter. If a workload is absolutely critical, you need to be sure you are trusting two separate cloud vendors all the way down to the infrastructure. This means ensuring that your SaaS vendors don’t rely on the same IaaS or PaaS vendors.
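As a minimal sketch of what "all the way down" could look like in practice, the Python below walks each vendor’s dependency chain to the bottom and checks whether two vendors end up on the same infrastructure. The vendor names and the dependency map are entirely hypothetical; in real life you would have to get this information from the vendors themselves.

```python
def infrastructure_roots(vendor, depends_on):
    """Follow a vendor's dependencies down to the providers that depend on
    nothing else in the map (treated here as the underlying infrastructure)."""
    seen = set()
    roots = set()
    stack = [vendor]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        upstream = depends_on.get(current, [])
        if not upstream:
            roots.add(current)  # nothing below it: bottom of the chain
        stack.extend(upstream)
    return roots

def shared_infrastructure(vendor_a, vendor_b, depends_on):
    """Return the infrastructure providers both vendors ultimately rely on."""
    return infrastructure_roots(vendor_a, depends_on) & infrastructure_roots(vendor_b, depends_on)

# Hypothetical dependency map: each vendor lists who it is hosted on.
DEPENDS_ON = {
    "payments-a": ["paas-x"],
    "paas-x": ["iaas-1"],
    "payments-b": ["iaas-1"],
    "payments-c": ["iaas-2"],
}

# payments-a and payments-b look like separate vendors, but both sit on iaas-1.
print(shared_infrastructure("payments-a", "payments-b", DEPENDS_ON))
# payments-a and payments-c share nothing at the bottom of the chain.
print(shared_infrastructure("payments-a", "payments-c", DEPENDS_ON))
```

An empty result for a pair of vendors is what you want for a critical workload. Anything else means a single provider failure could take out both of your "redundant" options.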