Cloud server and network outages are wrecking balls, and they can happen even to a dominant market player like Amazon Web Services (AWS). The latest is an extensive outage brought on by human error at an AWS data center in Virginia. Many consider it to be the worst hit in four years.
In June 2016, the storms that battered Sydney also shook AWS services. An extensive power outage led to the failure of a number of Elastic Compute Cloud (EC2) instances and Elastic Block Store (EBS) volumes, many of which hosted critical workloads for big brands. As a result, a number of prominent websites and online services went down for 10 hours over the weekend, hitting businesses severely.
Though such outages are infrequent today, their scale and impact are large, and they can cause huge revenue losses for big brands and their customers.
To protect your applications running in AWS from disaster, your organization needs a proper disaster recovery plan.
Disaster Recovery
Disaster Recovery (DR) is a process that helps you prepare for, and recover from, any kind of disaster. An organization needs a DR plan that can be tested periodically and that scales with the growth of its data. In this article, we will discuss how to achieve DR in AWS.
Disaster Recovery Plan
A Disaster Recovery Plan (DRP) is a documented, structured approach with instructions for recovering disrupted systems and networks. It helps an organization keep the business running as close to normal as possible.
Not all organizations can afford an on-premises disaster recovery setup, because it is expensive to implement and maintain. Companies such as Nimesa and Commvault offer disaster recovery and data protection solutions specifically for AWS. With those products, organizations can afford a DRP without the hassle of implementing and maintaining it themselves.
While developing the Nimesa Disaster Recovery Solution for the AWS cloud, I identified some steps to consider when designing a DR plan with AWS. They are as follows:
- Back up your data: Don't forget to periodically back up your EC2, RDS, and ECS instances (persistent storage).
- Save cost by deleting old copies: A retention plan is as useful as a backup plan. If you keep taking snapshots and never delete them, you will end up paying more for backups. Use a proper retention plan, such as keeping the latest 'n' backup copies and deleting the older ones.
- Choose the right backup strategy: For example, you can back up EC2 instances by periodically taking snapshots of the attached EBS volumes or by periodically creating an AMI of the instance.
- Identify the critical applications: Not all instances need backup, and sometimes not every volume of an instance needs to be protected.
- Cross-region copy: Periodically copying snapshots to multiple locations (regions) and having a proper secondary retention plan helps the organization recover instances after a disaster.
- Test the DR plan: Periodically testing the DR plan and verifying that the instances are up and running in the secondary site helps avoid surprises at the time of a disaster.
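The "keep 'n' copies and delete the older ones" retention step above can be sketched in Python. This is a minimal illustration, not how Nimesa or any other product implements retention. The helper names `snapshots_to_delete` and `prune_volume_snapshots` are hypothetical; `ec2_client` is assumed to be a boto3 EC2 client (for example, one created with `boto3.client("ec2")`), and only its `describe_snapshots` and `delete_snapshot` calls are used. Pagination of large snapshot lists is ignored for brevity.

```python
def snapshots_to_delete(snapshots, keep):
    """Return the snapshots that fall outside the newest `keep` copies.

    `snapshots` is a list of dicts with "SnapshotId" and "StartTime" keys,
    the shape returned in the "Snapshots" list of describe_snapshots.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    # Newest first, so everything after index `keep` is past retention.
    newest_first = sorted(snapshots, key=lambda s: s["StartTime"], reverse=True)
    return newest_first[keep:]


def prune_volume_snapshots(ec2_client, volume_id, keep):
    """Delete all but the newest `keep` snapshots of one EBS volume.

    Hypothetical helper: assumes `ec2_client` behaves like a boto3 EC2
    client and that the account owns the snapshots ("self").
    """
    response = ec2_client.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )
    deleted = []
    for snap in snapshots_to_delete(response["Snapshots"], keep):
        ec2_client.delete_snapshot(SnapshotId=snap["SnapshotId"])
        deleted.append(snap["SnapshotId"])
    return deleted
```

Keeping the selection logic in a pure function makes the retention rule easy to test without touching AWS. The same rule could be applied separately in the secondary region to implement a secondary retention plan.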
Once you have created your organization's AWS DRP, implementing it can seem like a daunting task. Third-party services like Nimesa can help: they schedule backups and cross-region copies, delete backups according to the retention plan, and bring up instances from backup in the secondary region.