To Amazon’s credit, they were quick to acknowledge the problem and get everything working again. However, the outage was significant, and it demonstrated how many well-known applications rely on these kinds of cloud services. It also showed that, although S3 has historically had excellent availability statistics, a single outage can make organisations revisit the way that they consume public cloud services.
As with any IT implementation, the availability of services is a trade-off between cost and risk. Traditionally, storage arrays would make use of RAIDed disks to ensure that data would still be available in the event of one or more physical disk failures. With object storage technologies such as S3, data is spread across multiple disks within a particular availability zone to protect against disk failure (and zone failure). The option is there (at extra cost) to take advantage of replication and sharding over multiple AZs (within the same region), or even three-way replication over multiple regions (geo-replication). It depends on how important that data is to you. If it is data for long-term retention and infrequent access, perhaps you push it out to S3 IA or even Glacier.
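The cost-versus-risk trade-off can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that each copy of the data fails independently of the others (a simplification that rarely holds perfectly in practice) and using a purely illustrative 99.9% availability figure rather than any provider's published number. The function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(single: float, copies: int) -> float:
    """Probability that at least one of `copies` independent replicas is reachable.

    Assumption: replicas fail independently. Shared dependencies
    (networking, control planes, software bugs) reduce the real-world benefit.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 - (1.0 - single) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours per year during which no replica is reachable."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    # 0.999 is an illustrative figure, not a quoted SLA.
    for copies in (1, 2, 3):
        availability = combined_availability(0.999, copies)
        downtime = expected_downtime_hours(availability)
        print(f"{copies} copies: {availability:.9f} available, "
              f"{downtime:.6f} hours/year expected downtime")
```

Each extra copy multiplies the storage bill but shrinks the expected downtime by roughly three orders of magnitude under these assumptions, which is the decision the architect is really weighing.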
So, IT architects are faced with some interesting decisions: Do they take advantage of these higher-value and higher-availability options? Or, should they think about using multiple clouds from different vendors to spread the risk?
Most people would agree that it is unlikely that Amazon, Microsoft or Google will go bust anytime soon. However, in the ever-changing political landscape around data sovereignty, government access to data, security concerns, etc., there are many considerations. The Big 3 are not the only cloud service providers out there. VMware provides a large ecosystem of vCloud Air Network (vCAN) service providers around the world, iland being one of them. For many organisations there is comfort in the fact that iland runs the same technologies that they are familiar with in their on-premises facilities.
Over the past year or so, a new term has come into use in cloud computing, especially in highly regulated industries like financial services: “vendor diversity.” Prior to the advent and mass adoption of the public cloud, there were a number of instances where customers were faced with big decisions around their IT infrastructures. In some cases the IT service provider did go bust, and customers had a very short window to get all their equipment out of co-location facilities. In the UK this is sometimes referred to as the 2e2 effect. In other cases, the IT service provider decided to exit the co-location or managed service business for whatever reason and, again, customers had to move much sooner than they’d planned.
So, the guidance that is coming out from industry regulators is quite simple: don’t put all of your eggs in one basket. See, for example, what the FCA has to say on the matter. If you run production services with one provider, I suggest you run DR with another, and maybe your backup with a third.
The consultants behind the Clover Index, the Finance Sector Private Cloud Vendor benchmarking tool, have been highlighting this concern to the market for some years. The Clover Index includes in its analysis an assessment of a CSP’s ability to provide vendor diversity options to clients.
Prior to joining iland, I spent some time as a Cloud Solutions Architect with Microsoft Azure. On one of the projects I was working on, the customer had decided to split their workloads between Azure, AWS and a couple of VMware vCAN clouds, as well as keeping some VMware workloads running in their own data centres. They made the decision from the outset to put in an MPLS cloud that would connect their various office locations with the cloud providers, so high-speed connectivity would not be an issue. The initial requirements were all traditional virtual machine-based implementations using Windows and Linux, and they were building clustered “design for failure” applications that incorporated most of the clouds all the time. Some of the applications even used back-end object storage that stored data in either Azure Blob storage or Amazon S3.
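To illustrate the “design for failure” idea applied to back-end object storage, here is a minimal sketch of a wrapper that writes each object to more than one provider and reads from whichever one responds. It is not the customer's implementation. The class names are hypothetical, and `InMemoryStore` is a stand-in for a real provider client so that the example runs without any cloud SDK or credentials.

```python
from typing import Dict, List, Protocol


class ObjectStore(Protocol):
    """Minimal interface a provider adapter would need to implement."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Hypothetical stand-in for a provider client; makes no real SDK calls."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True  # flip to False to simulate an outage
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return self._objects[key]


class MultiCloudStore:
    """Writes to every reachable store; reads from the first that has the key."""

    def __init__(self, stores: List[ObjectStore]) -> None:
        if not stores:
            raise ValueError("at least one store is required")
        self._stores = stores

    def put(self, key: str, data: bytes) -> int:
        """Return how many providers accepted the write."""
        written = 0
        for store in self._stores:
            try:
                store.put(key, data)
                written += 1
            except ConnectionError:
                continue  # skip the provider that is down
        if written == 0:
            raise ConnectionError("no provider accepted the write")
        return written

    def get(self, key: str) -> bytes:
        for store in self._stores:
            try:
                return store.get(key)
            except (ConnectionError, KeyError):
                continue  # fall through to the next provider
        raise KeyError(key)


if __name__ == "__main__":
    s3_like = InMemoryStore("provider-a")
    blob_like = InMemoryStore("provider-b")
    store = MultiCloudStore([s3_like, blob_like])

    store.put("report.pdf", b"quarterly figures")
    s3_like.available = False  # simulate an outage at the first provider
    print(store.get("report.pdf"))
```

A production version would also need to re-synchronise writes that a provider missed while it was down, which this sketch deliberately leaves out.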
Coming back to iland, we are now working closely with Managed Service Providers who either have their own VMware-based facilities and are using iland for DRaaS and/or backup, or who might be working with another VMware vCAN partner and, again, use iland for DRaaS and/or backup. In this way, the MSP is not tightly coupled to either the production or the DR/backup facility. If the customer chooses to part ways with the MSP, all their “stuff” does not necessarily need to be moved to another location.
In the near future, I foresee situations where an MSP might be managing traditional VM-based facilities in iland or another vCAN partner, while simultaneously having some of the new “cloud native” applications running in AWS, Azure or GCE. As iland runs in carrier-neutral data centres, often the exact same facilities that are used by the Big 3, it is very easy to provide a high-speed cross-connect between iland and AWS Direct Connect or Azure ExpressRoute.
So, in summary, when deploying services to the cloud, explore all the options available. Are you prepared to “double up” or even “triple up” to get the availability you need? Or, are you prepared to take the risk? It’s “just IT” under the covers and things will go wrong, no matter how shiny the marketing looks.
Take advantage of multiple cloud vendors for the reasons discussed, but not to the extent of making things unmanageable.