Managing IT Risks in a Cloudy World – Takeaways from the Recent Amazon Outage
Several popular websites and companies were affected by the recent Amazon cloud outage. It was surprising to see how many of them had no backup plan for restoring their applications at an alternate location. Using the cloud doesn’t mean that we should forget the lessons we have learned over the years about managing IT risk. There are several ways companies can reduce their exposure to this type of outage. For example, one of Kaavo’s customers runs their application across the Amazon and Rackspace clouds using Kaavo IMOD. Their virtual servers at each provider run at around fifty to sixty percent of capacity, so a total outage at one provider wouldn’t bring their application down. Splitting a deployment across multiple clouds is not always advisable, however; for example, stateful or transactional applications can’t be split this way because of network latency.
Our own Kaavo IMOD application runs on Amazon, and we have a backup plan to bring the entire application back up on the Rackspace cloud, restoring the data from the last backup, within minutes using a bootstrap instance of IMOD. Using the Kaavo System Definition File, we have captured all the deployment, runtime management, and data backup/restore information in one place. Kaavo IMOD takes this information and automatically executes all the steps required to fully restore the application.
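
To make the capacity reasoning behind a multi-cloud split concrete, here is a minimal Python sketch of the arithmetic. It assumes, purely for illustration, that the providers have equal capacity, that load is split evenly between them, and that all of a failed provider's load moves to the survivors. The function name `utilization_after_failover` is hypothetical and is not part of Kaavo IMOD.

```python
def utilization_after_failover(utilization: float, providers: int = 2) -> float:
    """Utilization on each surviving provider after one provider fails.

    Assumes equal-sized providers, an even load split, and that the failed
    provider's load is redistributed evenly across the survivors.
    """
    if providers < 2:
        raise ValueError("need at least two providers to fail over")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    total_load = utilization * providers  # load in units of one provider's capacity
    return total_load / (providers - 1)


if __name__ == "__main__":
    for u in (0.50, 0.55, 0.60):
        after = utilization_after_failover(u)
        print(f"{u:.0%} per provider -> {after:.0%} on the survivor")
```

Under these simplifying assumptions, the lower the steady-state utilization at each provider, the more headroom the surviving provider has. If the result exceeds one hundred percent, the surviving provider would need to add capacity or accept degraded performance during the outage.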
Any of the companies that suffered from the recent outage could have significantly reduced their downtime, and the impact of the outage, by having a backup plan for automatically restoring their applications. Large, complex deployments require many configuration changes and tasks to fully restore the application, e.g. restoring the database from backup, configuring DNS entries, and starting processes in a specific order. During my time in enterprise IT, I went through several disaster recovery (DR) test drills. Almost always, it is very easy to recover the storage, servers, etc. in the backup datacenter. However, once the servers and storage are back, there is always a long, painful list of steps: going through the DR manuals and executing tasks manually to restore specific applications. At the very minimum, companies using the cloud should have a DR plan for restoring their applications at an alternate cloud provider. Using Kaavo IMOD, application recovery can be fully automated and executed in a short period of time at one or more alternate sites without human intervention. Please contact us if you want to learn more about how Kaavo IMOD can fully automate application recovery in the cloud.
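
As a rough illustration of what automating the restore tasks above might look like, here is a minimal Python sketch that runs recovery steps in dependency order. It is not the Kaavo System Definition File format or any IMOD API; the step names, the `DEPENDENCIES` table, and the `plan_recovery` and `run_recovery` helpers are all hypothetical. It relies only on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

# Hypothetical recovery steps, each mapped to the steps it depends on.
DEPENDENCIES: Dict[str, Set[str]] = {
    "provision_servers": set(),
    "restore_database": {"provision_servers"},
    "start_app_servers": {"restore_database"},
    "update_dns": {"start_app_servers"},
}


def plan_recovery(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return the steps in an order that respects every dependency.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


def run_recovery(
    dependencies: Dict[str, Set[str]],
    actions: Dict[str, Callable[[], None]],
) -> List[str]:
    """Run each step's action in dependency order; return the completed steps.

    An exception raised by any action stops the run, so later steps never
    execute on top of a failed prerequisite.
    """
    completed: List[str] = []
    for step in plan_recovery(dependencies):
        actions[step]()
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Placeholder actions that only print; real ones would call provider APIs.
    demo_actions = {
        name: (lambda n=name: print(f"running {n}")) for name in DEPENDENCIES
    }
    run_recovery(DEPENDENCIES, demo_actions)
```

Capturing the ordering as data rather than as pages in a DR manual means the same plan can be rehearsed in a drill and executed unattended during a real outage.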