We are looking to migrate our applications to the cloud. I have read a lot about architecting for resiliency. But has anyone done DR testing in Azure or AWS? What advice can you provide?
Migrating applications to the cloud should be a multi-step approach.
Your DR strategy depends on the public cloud vendor you choose, and each vendor provides multiple options at the site, app, or VM level.
- A private cloud can be configured based on capacity, and several technologies can be used, such as VMware SRM, Zerto, AWS replication, Azure Site Recovery, and similar services from other hardware vendors.
- DR for on-premises environments is also considered in many scenarios.
Other factors
There are other parameters to review:
- RTO and RPO are very important parameters and should be the deciding factor in the choices you make.
- Network latency: confirm that the latency your applications need is supported.
- Are you migrating the complete application ecosystem to the cloud (database, web, app, middleware)? It is very important to review this.
- Keep any dependency on on-premises services, such as authentication, loose, since those services may not be available during an outage.
- Cloud-based licenses for your app, database, and other configured instances.
- Billing, which is based on usage such as I/O and capacity, especially for analytics.
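To make the RTO/RPO point concrete, here is a minimal sketch of filtering candidate DR options against your targets. The option names and the minute figures are hypothetical placeholders, not vendor numbers; substitute the values you measure in your own DR tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    rto_minutes: int  # expected time to restore service
    rpo_minutes: int  # expected maximum data-loss window


def options_meeting_targets(options, target_rto_minutes, target_rpo_minutes):
    """Return the options whose expected RTO and RPO are both within the targets."""
    return [
        o for o in options
        if o.rto_minutes <= target_rto_minutes and o.rpo_minutes <= target_rpo_minutes
    ]


# Illustrative figures only (assumptions, not measurements).
candidates = [
    DrOption("backup-and-restore", rto_minutes=1440, rpo_minutes=1440),
    DrOption("pilot-light", rto_minutes=240, rpo_minutes=15),
    DrOption("warm-standby", rto_minutes=30, rpo_minutes=5),
]

viable = options_meeting_targets(candidates, target_rto_minutes=240, target_rpo_minutes=60)
print([o.name for o in viable])
```

Cost and licensing would then decide among whatever options survive the filter.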
I hope I gave you a glimpse of a few action items, but there is more to do.
You can greatly increase your availability and reliability but the vast majority of it still relies on solution architecture. Does your solution support the mechanisms used in a cloud environment that give you those characteristics?
You must architect and engineer the technical DR aspects as you move to the cloud. Both Azure and AWS have multi-region, multi-location, and multi-instance awareness built into their services. But you must architect your applications to use them before they are deployed.
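As a rough illustration of what "architecting the application to use them" can mean, here is a small sketch of priority-ordered region failover that you could exercise in a DR drill. The region names and the `is_healthy` probe are hypothetical stand-ins, not an AWS or Azure API; in practice the probe would call your own health endpoint or the provider's health checks.

```python
from typing import Callable, Optional, Sequence


def pick_active_region(
    regions: Sequence[str], is_healthy: Callable[[str], bool]
) -> Optional[str]:
    """Return the first healthy region in priority order, or None if all are down."""
    for region in regions:
        if is_healthy(region):
            return region
    return None


# Simulated drill: pretend the primary region is unavailable.
outage = {"primary-region"}
active = pick_active_region(
    ["primary-region", "secondary-region"],
    lambda region: region not in outage,
)
print(active)
```

Running a drill like this against real endpoints is one way to confirm the application actually moves to the secondary region instead of assuming it will.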
No matter the cloud service provider, you also have to understand their responsibility model. What will they do in a given situation, and what do you need to do? For example, a solid backup and recovery strategy is still a mandatory part of any DR strategy. You can't assume high availability will help you if you truly need to perform a recovery.
I would suggest you read through the AWS Well-Architected Framework to see how they implement healthy systems. They even have a section on reliability. See "Shared Responsibility Model for Resiliency - Reliability Pillar" (amazon.com).